Datafeed Vendor Comparison

bpr

Well-Known Member
#61
Conclusions So Far

1) Most datafeed vendors use the NSE Level 1 datafeed.

2) All trading terminals are provided with the Level 2 datafeed. Of course, the brokers do not hold a data-vending licence, so the data cannot be extracted out of the platform; it is provided for trading only.

3) Only NSE tick-by-tick (TBT) data can produce a 100% accurate candlestick chart with correct OHLC values.

4) With any other type of datafeed there is always a chance of inaccuracy. A candlestick is nothing but a bunch of ticks. If a tick is missing at the high or low of the candle, the high and low values of that candle will be inaccurate. It is also possible that we are missing many ticks within the high-low range of the candle, in which case the candle is unaffected.
The same issue applies to the open and close of a candle. If ticks are missing at the precise moment one candle closes and the next one opens, the open and close prices will be inaccurate.
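To make point 4 concrete, here is a small illustrative sketch (the prices are made up): a candle built from ticks loses its true high when the extreme tick is dropped, while a dropped interior tick changes nothing.

```python
# Hypothetical example: a candle's OHLC is just an aggregate of the
# ticks inside its interval, so a dropped tick at the extreme silently
# corrupts the high (or low), while a dropped interior tick is harmless.

def candle(ticks):
    """Build (open, high, low, close) from a list of tick prices."""
    return ticks[0], max(ticks), min(ticks), ticks[-1]

full_feed = [100.0, 100.5, 101.2, 100.8, 100.6]  # all ticks received
thin_feed = [100.0, 100.5, 100.8, 100.6]         # the 101.2 high was dropped

print(candle(full_feed))  # (100.0, 101.2, 100.0, 100.6)
print(candle(thin_feed))  # (100.0, 100.8, 100.0, 100.6) -> wrong high
```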

5) There is also no guarantee that two data vendors using the same NSE Level 1 datafeed will match, as we have seen in some comparisons. This may come down to how the individual vendors handle the data themselves, or to how NSE distributes it.

6) NSE TAME chart data is also inaccurate; it must be using the NSE Level 1 data.

7) For charting, the Level 1, Level 2 and Level 3 datafeeds are all the same if all we care about is the candlestick chart and volume.

The only difference is that the Level 2 and Level 3 datafeeds contain additional market-depth information, up to 5 and 20 levels respectively, which is used for advanced volume-profile analysis. That is not used by Indian traders :eek: so there is no difference.

8) Comparing different data vendors is futile because we do not know who is correct and who is incorrect; there is no reference point.
If a data vendor provides tick-by-tick charts, those can become the reference point.
 
Last edited:

bpr

Well-Known Member
#62
@bpr

Yes, totally agree with you. Disclaimers should be added. At the end of the discussion, the question is: which vendor is accurate?

Even Zerodha's NEST trader did not detect 1487, so who is correct now?

eSignal is the best as per various people, so I just did a comparison with it. eSignal datafeeds are very hard to afford; only CEOs and big-time traders use eSignal.

@chartanalyst

Can you put up a comparison between GFDL and eSignal, or share the data in XLS format so the comparison can be done?
Nobody is 100% accurate; comparison is meaningless.
NeoTrade is not bad for the price; you can continue to use them.
 

mastermind007

Well-Known Member
#63
Any other viewpoint, apart from fxgood's, on the data discrepancies?
I disagree with fxgood. Indian exchanges' data services are not sub-standard. Yes, customer service and ethics are at times lacking, but the infrastructure in India is OK.

I've written code for data sampling (I cannot go into too much detail, but I can offer this):

Essentially, at the heart of all sampling logic there is a time interval (a few hundred milliseconds), and usually this is a configurable parameter. Let's call it t.

At start time T, the price is captured (the first price goes into O, H, L and C; volume is set to v).

At T + t, the price is captured again (C is set; the high and low are updated if necessary; volume is set to the differential between the current and earlier totals), and so on.

Over time, the output of sampling at 100 milliseconds can differ from sampling at 101 milliseconds for very fast-moving ticks.
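A minimal sketch of that sampling idea (illustrative only, with made-up tick values; real vendor code is far more involved): ticks carry a timestamp, price and cumulative volume, and are grouped into OHLCV bars of width t.

```python
# Illustrative sketch of interval sampling: ticks are
# (timestamp_ms, price, cumulative_volume) tuples, grouped into
# OHLCV bars of width t_ms.

def sample(ticks, t_ms):
    bars = []
    for ts, price, cum_vol in ticks:
        bucket = ts // t_ms
        if not bars or bars[-1]["bucket"] != bucket:
            # first price of the interval seeds O, H, L and C
            bars.append({"bucket": bucket, "o": price, "h": price,
                         "l": price, "c": price, "cum": cum_vol})
        else:
            b = bars[-1]
            b["h"] = max(b["h"], price)
            b["l"] = min(b["l"], price)
            b["c"] = price
            b["cum"] = cum_vol
    # volume per bar = differential of the cumulative volume
    prev = 0
    for b in bars:
        b["v"] = b["cum"] - prev
        prev = b["cum"]
    return bars

# The tick at t=100 ms falls into the second 100 ms bar but into the
# first 101 ms bar, so the two sampling rates disagree on the bar high.
ticks = [(0, 100.0, 10), (40, 100.5, 20), (100, 101.0, 30),
         (150, 100.2, 40), (210, 100.7, 50)]
print(sample(ticks, 100)[0]["h"])  # 100.5
print(sample(ticks, 101)[0]["h"])  # 101.0
```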

There are 22,500 seconds in NSE's trading session, so sampling at 1-second intervals caps the row count at 22,500.

With a sampling rate of 100 milliseconds, that is a potential data size of 2,25,000 rows. At an average row size of 100 bytes, that is about 22.5 MB of data per scrip per day.

We capture 250 scrips, so it gobbles up 5 GB+ per day. Hard drives are cheap, but not that cheap!
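For what it is worth, re-running that arithmetic with the stated assumptions (100 ms sampling, 100-byte rows, 250 scrips):

```python
# Back-of-envelope check of the storage arithmetic above.

session_seconds = 6 * 3600 + 15 * 60            # 9:15 to 15:30 -> 22,500 s
rows_per_scrip = session_seconds * 1000 // 100  # one row per 100 ms
bytes_per_scrip = rows_per_scrip * 100          # ~100 bytes per row

print(session_seconds)              # 22500
print(rows_per_scrip)               # 225000
print(bytes_per_scrip / 1e6)        # 22.5 MB per scrip per day
print(250 * bytes_per_scrip / 1e9)  # ~5.6 GB for 250 scrips
```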

Comparing datafeeds tick by tick in Excel will only muddle the issue and lead to sleepless nights. You'll never be able to arrive at a level where you can confidently grade a vendor.
 

mastermind007

Well-Known Member
#64
Thanks, TradeOptions.

It looks like eSignal is also missing the tick.

This is anilsingh's contract note, which does show the trade happening at 1487 at around 11:02:53 AM.



So on the one-minute chart, the 11:03 candle should have a high of 1487.

EDIT: or maybe the 11:02 candle. Either way, the high value is not captured.

eSignal data



I am waiting for the NeoTrade data.
Also, if some GFDL subscriber could post the data for this same issue, that would be great.

Instead of labelling the data source as inaccurate, here are a few more conclusions that can be inferred from this:

a) This could be a gala trade by the broker, especially likely if Mr. Singh had placed an at-market order.
b) The broker could have transacted (sold) from its own pool of holdings and never reported the trade to the exchange.
 
Last edited:

bpr

Well-Known Member
#65
Instead of labelling the data source as inaccurate, here are a few more conclusions that can be inferred from this:

a) This could be a gala trade by the broker, especially likely if Mr. Singh had placed an at-market order.
b) The broker could have transacted (sold) from its own pool of holdings and never reported the trade to the exchange.
a) What is a gala trade? This trade was an SL (stop-loss) trade, either SL-L or SL-M.

b) This was a futures trade. Also, there is no dark pool on NSE, and all trades must be reported to the exchange, as per Zerodha. The trade has an order number which can be verified with the exchange.
 
Last edited:

bpr

Well-Known Member
#66
I disagree with fxgood. Indian exchanges' data services are not sub-standard. Yes, customer service and ethics are at times lacking, but the infrastructure in India is OK.

I've written code for data sampling (I cannot go into too much detail, but I can offer this):

Essentially, at the heart of all sampling logic there is a time interval (a few hundred milliseconds), and usually this is a configurable parameter. Let's call it t.

At start time T, the price is captured (the first price goes into O, H, L and C; volume is set to v).

At T + t, the price is captured again (C is set; the high and low are updated if necessary; volume is set to the differential between the current and earlier totals), and so on.

Over time, the output of sampling at 100 milliseconds can differ from sampling at 101 milliseconds for very fast-moving ticks.

There are 22,500 seconds in NSE's trading session, so sampling at 1-second intervals caps the row count at 22,500.

With a sampling rate of 100 milliseconds, that is a potential data size of 2,25,000 rows. At an average row size of 100 bytes, that is about 22.5 MB of data per scrip per day.

We capture 250 scrips, so it gobbles up 5 GB+ per day. Hard drives are cheap, but not that cheap!

Comparing datafeeds tick by tick in Excel will only muddle the issue and lead to sleepless nights. You'll never be able to arrive at a level where you can confidently grade a vendor.
You know, I would not mind the extra space if that is what it takes to get a correct chart.
It is easy to hide behind the sampling excuse and say they will all be different.

The point is that we need a correct OHLC chart for a particular timeframe. We don't need TBT (tick-by-tick) data if we don't have to, but without TBT there is no correct chart.

I strongly feel NSE should upgrade their TAME charts to TBT so that they can act as a reference. They can make them more delayed, 15 minutes or so, if they want to.
 
#67
Thanks for the GFDL data.
I could not fully understand what you are saying.
How somebody designs their system is another question; here we are discussing only the datafeed.
If I needed a weighted average, I would use a line chart, not a candlestick chart.
I was trying to convey the complexity of the point under discussion.

What you want (100%, as per your terminology) is not as straightforward as you see it, and there are many techno-commercial challenges for vendors as well as traders. As a vendor, one has to invest in resources and technology to achieve that last, most difficult bit. As traders, we will need to be ready for the price bomb for such top-of-the-line products and services. And above all, there have to be sufficient numbers so that a vendor will be willing to do this.

There is nothing called 99% correct; it is either 100% correct or incorrect.
I accept your viewpoint academically, but my dear friend, there is nothing like 100% in the real world (this is my personal view). A complex system never claims to be 100% correct. For example, even in developed countries, the highest uptime SLA given by any hosting provider is 99.995%, which means some 2+ minutes of acceptable downtime per month. And it is compensated by giving some discount on your monthly invoice (i.e. they don't compensate your direct or indirect loss).

As we all know, achieving the last bit is always very tough. If you want an overview of what it takes to go from 99% to 99.995% in the IT industry, read this article from the blog of one of the most reputed data centres in India.
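As a quick check of that SLA figure, 99.995% uptime over a 30-day month works out to just over two minutes of allowed downtime:

```python
# Allowed downtime per 30-day month at a 99.995% uptime guarantee.

minutes_per_month = 30 * 24 * 60  # 43,200 minutes
allowed_downtime = minutes_per_month * (1 - 0.99995)

print(round(allowed_downtime, 2))  # 2.16 -> the "2+ minutes" quoted above
```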

There is nothing wrong in expecting more (or 100%), but I feel we should have a plan B ready in case that is not achieved. This is to ensure that we do not lose focus on what we want. Our ultimate aim is to earn a livelihood from the stock market, isn't it? Then instead of fighting the existing system (which costs time and money), I will accept it and find a way to succeed (again, my personal views).

IMO, all data vendors should include a disclaimer that their datafeeds are inaccurate.

Unless disclaimers are provided, a class-action suit can be brought against NSE.
They should not charge for something which is inaccurate.
Disclaimers are already all over the place, on their websites as well as in their product installers. It's a different story that we all don't care to read them ;) You will be surprised, but even the exchanges have clearly worded disclaimers stating that 'accuracy, sequence and completeness of data is not at all guaranteed'.

So... now what ? :)

Also, I fail to understand how the presence or absence of a disclaimer changes my trading.
 
#68
I strongly feel NSE should upgrade their TAME charts to TBT so that they can act as a reference. They can make them more delayed, 15 minutes or so, if they want to.
Sorry, but you have no idea what TBT will bring with it. Data vendors and traders will need significant upgrades to their hardware, processing software and internet bandwidth to handle such enormous data. It will create a host of new problems, I am sure. See this post by GFDL on another forum for some facts about TBT data.

It is easy to hide behind the sampling excuse and say they will all be different.
I think he is trying to convey the same point: the complexity and enormity of the issue.
Sure, you have a requirement, but unless you are ready to understand the problems and issues at the core, you won't be able to achieve what you want, my dear friend!
 

TradeOptions

Well-Known Member
#69
@ChartAnalyst, your posts have been extremely good. Thanks for sharing this info, brother. That makes complete sense to me. You seem to have good insight into the kind of stuff that happens at the back end of exchanges and data vendors. Please continue to post your views on this.

As traders, we are responsible for building precautionary measures into our trading game plan itself, so that a difference of a few ticks does not cause severe damage to our capital.

Thanks and regards
 

TradeOptions

Well-Known Member
#70
I disagree with fxgood. Indian exchanges' data services are not sub-standard. Yes, customer service and ethics are at times lacking, but the infrastructure in India is OK.

I've written code for data sampling (I cannot go into too much detail, but I can offer this):

Essentially, at the heart of all sampling logic there is a time interval (a few hundred milliseconds), and usually this is a configurable parameter. Let's call it t.

At start time T, the price is captured (the first price goes into O, H, L and C; volume is set to v).

At T + t, the price is captured again (C is set; the high and low are updated if necessary; volume is set to the differential between the current and earlier totals), and so on.

Over time, the output of sampling at 100 milliseconds can differ from sampling at 101 milliseconds for very fast-moving ticks.

There are 22,500 seconds in NSE's trading session, so sampling at 1-second intervals caps the row count at 22,500.

With a sampling rate of 100 milliseconds, that is a potential data size of 2,25,000 rows. At an average row size of 100 bytes, that is about 22.5 MB of data per scrip per day.

We capture 250 scrips, so it gobbles up 5 GB+ per day. Hard drives are cheap, but not that cheap!

Comparing datafeeds tick by tick in Excel will only muddle the issue and lead to sleepless nights. You'll never be able to arrive at a level where you can confidently grade a vendor.
Thank you for sharing this information, mastermind :thumb:

5 GB+ per day just for 250 scrips! That would become a big load for normal retail traders, for sure. Although for the big guys who have leased lines and dedicated servers it would not be a big deal at all, so such data can be accessed by them easily.

Thanks and regards
 
