Datafeed Vendor Comparison

yusi

Well-Known Member
Some observations on the NIFTY-I 'tick' data of 2019.02.01 from GDFL and TrueData provided by @TradeOptions and @bpr :

Content

- TrueData has Date, Time, LTP, LTQ columns. Contains trade records only, with count: 18019.
- GDFL additionally has BuyPrice, BuyQty, SellPrice, SellQty, Open Interest. As a result, rows with 0 LTQ exist. Total records: 21824. Count of trade records: 18004.
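
The split between total rows and trade rows in a GDFL-style file can be sketched as below. This is only a minimal illustration: the column names, the CSV layout, and the sample values are assumptions, not the vendor's actual export format. A row counts as a trade record when its LTQ is greater than zero.

```python
import csv
import io

# Hypothetical GDFL-style sample; column names and values are illustrative only.
SAMPLE = """Date,Time,LTP,LTQ,BuyPrice,BuyQty,SellPrice,SellQty,OpenInterest
2019-02-01,09:15:01,10870.00,75,10869.50,150,10870.00,225,1000
2019-02-01,09:15:01,10870.00,0,10869.55,300,10870.00,225,1000
2019-02-01,09:15:02,10871.20,150,10870.00,75,10871.20,450,1075
"""

def count_records(csv_text):
    """Return (total_rows, trade_rows); a trade row has LTQ > 0."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    trades = [r for r in rows if int(r["LTQ"]) > 0]
    return len(rows), len(trades)

total, trades = count_records(SAMPLE)
print(f"total rows: {total}, trade rows: {trades}")
```

Rows with a zero LTQ are quote-only updates (bid/ask or open interest changes), which is why the GDFL total is larger than its trade count.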

The screenshots below trace the various kinds of differences. I have not counted the individual differences by type (2664 groups in total; a group may be one or more rows), but they are highlighted in the images. In each image, GDFL is on the left and TrueData is on the right.

TimeStamp

The illustrated differences are time jitter (very few) and duplicate time-stamps (only in TD). In the jitter example, the time is 09:23:22 in GDFL and 09:23:21 in TD, and each feed is missing the time that the other one shows. TD has quite a few duplicate time-stamps; my guess is that in AmiBroker the last duplicate would overwrite the earlier one, resulting in lost data.
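
The overwrite concern can be sketched as follows. This is a toy model of "last duplicate wins" keyed on the time-stamp, not AmiBroker's actual import logic, and the tick values and function names are illustrative assumptions.

```python
from collections import Counter

# Illustrative ticks as (time, LTP, LTQ); the last two share a time-stamp.
ticks = [
    ("09:23:21", 10885.00, 1125),
    ("09:23:23", 10884.65, 525),
    ("09:23:23", 10884.45, 75),
]

def duplicate_timestamps(rows):
    """Return {time: count} for every time-stamp that occurs more than once."""
    counts = Counter(t for t, _, _ in rows)
    return {t: n for t, n in counts.items() if n > 1}

def last_write_wins(rows):
    """Hypothetical import keyed on time-stamp: later rows overwrite earlier ones."""
    merged = {}
    for t, price, qty in rows:
        merged[t] = (price, qty)
    return merged

dups = duplicate_timestamps(ticks)
kept = last_write_wins(ticks)
lost_volume = sum(q for _, _, q in ticks) - sum(q for _, q in kept.values())
print(dups, lost_volume)
```

Under this assumed behaviour the earlier of the two 09:23:23 rows disappears, and its quantity is the volume that goes missing.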



Price (LTP)

These differences tend to occur in clusters, typically near high volumes. The largest one I noticed was at 09:18:14, where GDFL shows 10887.55 and TD shows 10894.00. Generally, the differences are few and small.



Volume (LTQ)

Surprisingly, apart from the spurious row in TD with an LTQ of 1 at the open, no differences were noticed in individual records.

Day Open / High / Low / Close

NSE data gives the Open, High, and Low values as 10870.00, 11023.20, and 10835.10.

TD has messy open data. Strangely, after import into Ami, the open shows a value of 10870 (volume 1). For all other duplicate time-stamps, the price/volume of the later quote is taken. The same multiple time-stamp open data occurs in TD's BankNifty data as well.



The Day High is captured by TD at 13:04:38. The corresponding price in GDFL is 11020.10.



The Day Low of 10835.10 at 14:32:24 is again captured by TD but not by GDFL.



The Day LTP of 10918.00 is captured by both GDFL and TD.
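
A day-level comparison like the one above can be sketched by deriving Open/High/Low/Close from each feed's ticks and listing the fields that disagree. `feed_a` and `feed_b` are hypothetical feeds: the prices are borrowed from this post, but the quantities, the closing time-stamp, and the fact that only the 13:04:38 print differs are simplifying assumptions.

```python
# Illustrative ticks as (time, LTP, LTQ), in chronological order.
feed_a = [
    ("09:15:00", 10870.00, 1),
    ("13:04:38", 11023.20, 75),
    ("14:32:24", 10835.10, 75),
    ("15:29:59", 10918.00, 75),
]
# Same rows, except the 13:04:38 print carries the lower price.
feed_b = [
    ("09:15:00", 10870.00, 1),
    ("13:04:38", 11020.10, 75),
    ("14:32:24", 10835.10, 75),
    ("15:29:59", 10918.00, 75),
]

def day_ohlc(ticks):
    """Derive the day's open, high, low, and close from ordered ticks."""
    prices = [ltp for _, ltp, _ in ticks]
    return {"open": prices[0], "high": max(prices),
            "low": min(prices), "close": prices[-1]}

def ohlc_mismatches(a, b):
    """Return {field: (value_a, value_b)} for each OHLC field that differs."""
    oa, ob = day_ohlc(a), day_ohlc(b)
    return {k: (oa[k], ob[k]) for k in oa if oa[k] != ob[k]}

print(ohlc_mismatches(feed_a, feed_b))
```

A single missed print at the extreme is enough to shift the derived Day High, even when every other row matches.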

Minor High/Lows

Similar clusters of mismatches tend to happen at minor highs and lows. An example, a minor low at 14:10:40, is shown below. It is difficult to judge which value is more accurate.



Total contracts traded for day

NSE shows 250,728
GDFL shows 250,705 (250,728 in M1 data)
TD shows 250,728.
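
The gap against the exchange figure is simple arithmetic; a small sketch using the totals quoted above (the dictionary labels and function name are just illustrative):

```python
NSE_TOTAL = 250_728  # exchange-reported contracts for the day

vendor_totals = {
    "GDFL (tick)": 250_705,
    "GDFL (M1)": 250_728,
    "TrueData": 250_728,
}

def volume_gaps(reference, totals):
    """Return vendor total minus the reference total for each vendor."""
    return {name: total - reference for name, total in totals.items()}

gaps = volume_gaps(NSE_TOTAL, vendor_totals)
for name, gap in gaps.items():
    print(f"{name}: {gap:+d} contracts")
```

Only the GDFL tick file falls short, by 23 contracts; its M1 data and the TD total agree with NSE.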


@bpr: could you please check how the open and duplicate time-stamp records show up in your AmiBroker, comparing the live feed and EOD 'tick' data?
 

bpr

Well-Known Member
@yusi good analysis .. I have not used ami for a while ...give me some time ....

I quickly checked in multicharts which is what I am using now

For Nifty, both points at the duplicate timestamp 09:23:23 are shown. The software seems to auto-generate its own timestamps, different from those in the data, but all data points are present, e.g.:
09:23:22.793 10885 1125
09:23:23.769 10884.65 525
09:23:24.058 10884.45 75

The open is 10870. Both phantom records at the open are shown as:

09:15:00.128 10870 1
09:15:00.129 10870 1
 

bpr

Well-Known Member
OK, now the same thing for Ami ... I see Ami has no issues displaying duplicate timestamps; the timestamps are exactly as in the data.

NIFTY open
09:15:00 10870 1
09:15:00 10870 1

09:23:21 10885 1125
09:23:23 10884.7 525
09:23:23 10884.5 75

Interesting to note that the price has been rounded to a single decimal place instead of 2 ... I don't know if there is any setting to avoid that.

I did try to scan real-time data for a few minutes for duplicate timestamps but could not find any ... not sure how frequent those are ... but I think it should work like historical data.
 

yusi

Well-Known Member
@bpr: Thanks. I now realize that $TICKMODE was not set.

Not sure why price is rounding in your case.

Taking a random sample from the file that you gave: 09:17:xx has 4 duplicate time-stamps, 12:59:00 has 0, and 13:16:xx has 10. Typically, the record for the neighbouring second is missing. Yes, it should not be an issue.
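
The "neighbouring second is missing" pattern can be checked with a rough sketch like the one below. The time-stamps are made up for illustration and the function name is hypothetical; it assumes times are `HH:MM:SS` strings.

```python
from collections import Counter
from datetime import datetime, timedelta

FMT = "%H:%M:%S"

def missing_neighbours(times):
    """For each duplicated time-stamp, list adjacent seconds absent from the data."""
    counts = Counter(times)
    present = set(times)
    result = {}
    for t, n in counts.items():
        if n > 1:
            ts = datetime.strptime(t, FMT)
            prev_s = (ts - timedelta(seconds=1)).strftime(FMT)
            next_s = (ts + timedelta(seconds=1)).strftime(FMT)
            result[t] = [s for s in (prev_s, next_s) if s not in present]
    return result

# Illustrative sequence: 09:23:23 appears twice and 09:23:22 never appears.
times = ["09:23:21", "09:23:23", "09:23:23", "09:23:24"]
print(missing_neighbours(times))
```

If a duplicated second regularly sits next to a gap, that would suggest one of the two records has simply been stamped a second early or late rather than being a true duplicate.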
 

bpr

Well-Known Member
So, any conclusion from this whole exercise ... is TD better than GDFL?
Are we saying these are even close to TBT (tick-by-tick)? (I completely disagree with that.)
Also, the phantom record with the wrong volume at the start in TD captures the correct open price, so is it there for a reason?
 

cloudTrader

Well-Known Member
@bpr,

Since you are using MultiCharts, could you point out the best feature that made you decide to use it in place of AmiBroker?
 

yusi

Well-Known Member
I do not want to get into the position of judging between TrueData and GDFL. This was just a little weekend time spent on one data set. Besides F&O, the equity offering needs a look as well. Neither is better than 1-second data; the vendors seem to have taken 'tick' to mean the same as the ticking of a wall clock. Both are among the top Indian data providers and have been in business for 5+ years.

In terms of data accuracy, TD seems to have a slight edge in capturing highs and lows; this would affect trendlines, levels, and envelopes in charts. GDFL has no reason to be missing the Day High and Low values, given that these can be captured by terminal-based utilities as well. TD needs to work on its time-stamps, not that they largely affect M5, M1, or even 13-tick charts. You should write to them to ask why their open data quotes are the way they are; I cannot figure out the reason.

Besides data quality, the quality of plug-ins / support / uptime affect the vendor decision.

Treat it more like a curiosity exercise for the differences between near equals.
 
