Notebook

RVOL - 20 day Relative Volume Calculaterer

In [14]:
"""
First, if you don't know what RVOL is: it is used to compare the accumulated volume at a specific point in time 
on the day of interest to the average acuumulated volume by the same point over the past 20 days. 
Why not just make an SMA of volume and compare your current volume to that? 
RVOL indicates changes in participation minute by minute. For example: Comparing 5 minutes into the 
open when accumulated volume is 100K on an average day, and on the day in question it is 1M (RVOL would be 10).
RVOL would give you a better idea of a change in the participation character of the stock in question. 
If the stock normally does 5M in average volume in a day, the 900K change by that minute might not be picked up 
as significant with an SMA because it is still below average. By the end of the day, volume might be 15M or 20M. 

For any given minute in a day, if RVOL is elevated (>1.0 means higher than average, >3.0 means there is an 
abnormal amount of volume), the stock is probably being traded by more participants, which in turn 
might cause it to move more than average (up or down), or range aggressively. 
SMB capital, a propietary trading firm located in uptown Manhattan, who's main speaker is Mike Bellafiore, author of
the trading classics "one good trade" and "the playbook", uses the term "in play" to describe this... 
(if you have ever watched more than one of their videos on youtube, you'll understand that last sentence, it wasn't
an advertising plug for them). When combined with a catalyst like news, its a good benchmark to let you know the 
stock is something you should spend time watching due to the increased volitility that usually comes with the 
combination.

Summary of the code: 
-Create a relative volume calculation based on a symbol and date of interest as inputs. Relative volume
 needs to be instantiated as an object first, and then .calculate(date, "symbol") is called on that object 
 (see the bottom 3 lines for an example)
-Look back through 20 days of minute data 
-Make a rolling sum of the volume data minute to minute (accumulated the volume)
-Make a dataframe of the accumulated volumes over the 20 days, the last column to be made 
 (the day we are interested in) is removed from the dataframe and appended to the output dataframe as 'vol_today'. 
-Average accumulated volumes for each row (row = minute() from timestamps), add into a new  
 column using [dataframe].mean(axis = 1). 
-Remove and appended the column to the output RVOL dataframe as 'avg_vol'. 
-Once that is done, vol_today is divided by avg_vol to give RVOL by minute
-You could change the code to only copy the 'vol_today' if you wanted to continue to roll data into the 
 dataframe over more than 1 day. 
-I am no expert programmer. There are likely issues with the code, I have run into a few. This is the cleanest
 most elegant code I could hack together to make this indicator, sorry. You are welcome to do what you want 
 with it and share it. I will change it as I come across issues, but it is up to you to adapt it to your needs.
 
 ***updated to be able to use on local jupyter notebook with something like a .csv file, got rid of NaN values by
 filling in all values with the rolling sums, added a line to be able to use pre/post market, convert times from
 UTC to Eastern US
"""

import numpy as np
import pandas as pd
from datetime import time, timedelta, date

class RelativeVolume:
    def __init__(self):
        self.rvol = []
    
    #use this function to create dates from timestamps
    def dateparse(timestamp):
        timer = pd.to_datetime(timestamp, unit ='s') - pd.Timedelta(hours = 5)
        return timer
     
    def calculate(self, tdate, symb):
        #-----intialize variables-------------------------------------------------------#        
        self.separated_dates = [] #timestamp date handler
        self.separated_times = [] #timestamp time handler
        
        #20 trading days is standard for RVOL calculation, which is 4 - 5 day weeks when you remove weekends
        self.trading_date = tdate #trading date we are interested in, last day in data
        self.rvol_start =  self.trading_date - timedelta(days=20) #rvol starting date
        self.symb = symb #symbol that is passed in
        
        #get minute pricing for the symbol and dates of interest
        pricing = get_pricing(self.symb, start_date=str(self.rvol_start), end_date=str(self.trading_date), frequency='minute')
        
        #converts UTC to US/Eastern
        pricing.index = pricing.index - pd.Timedelta(hours=5)
        
        #-----separate dates and times from timestamps into their own arrays----------#
        for x in pricing.index:
            mydates = x.date()
            if mydates not in self.separated_dates:
                self.separated_dates.append(mydates)
        
        #uncomment the below line if you are using data with pre/post market
        #time_range = pd.date_range(start = str(self.separated_dates[-1]) + ' ' + str(time(4,0)), end = str(self.separated_dates[-1]) + ' ' + str(time(20,0)), freq = "T")
        time_range = pd.date_range(start = str(self.separated_dates[-1]) + ' ' + str(time(9,30)), end = str(self.separated_dates[-1]) + ' ' + str(time(16,0)), freq = "T")
        for y in time_range:
            mytimes = y.time()
            if mytimes not in self.separated_times:
                self.separated_times.append(mytimes)

        
        #print(self.separated_dates)
        #print(self.separated_times)

        #-----Rolling volume accumulation--------------------------------------------#
        #initialize dataframe used to store accumulated volume values
        #if the day used to build this is a holiday, you may have an issue FYI, I didnt test
        rvol_helper = pd.DataFrame(index = self.separated_times, columns = self.separated_dates)
        #print(rvol_helper)

        #make a rolling sum of the volume by minute for each date
        for x in range(len(self.separated_dates)):
            currDate = self.separated_dates[x] #date we are working on
            minutes = [] #helper array
            rvolx = 0 #used to accumulate volume
            #print(currDate)
            #loop through the pricing data, sum the volumes for the specific day
            #append to the minutes array, which is then used to build the rvol_helper day by day
            for y in range(len(self.separated_times)):
                time_now = self.separated_times[y]
                date_check = pd.Timestamp.combine(currDate, time_now)
                try:
                    rvolInt = pricing['volume'].loc[date_check]
                except:
                    rvolInt = rvolx
                else: 
                    rvolx += rvolInt
                
                rvol_helper.at[time_now, currDate] = rvolx

              
        #------Parse data to dataframes for handling--------------------------------#
        #create RVOL dataframe
        self.rvol = pd.DataFrame(index = self.separated_times, columns = ('vol_today', 'avg_vol', 'rvol'))
        #print(rvol_helper) #debug
        
        #remove day of interest volume, add to RVOL dataframe as 'vol_today'
        self.rvol['vol_today'] = rvol_helper.pop(rvol_helper.columns[-1])
        
        #average the 19 days of volume by minute (across rows)
        rvol_helper['avg_vol'] = rvol_helper.mean(axis=1)
        #print(rvol_helper) #debug
        
        #remove the column of average volumes per minute, append to RVOL as 'avg_vol'
        self.rvol['avg_vol'] = rvol_helper.pop(rvol_helper.columns[-1])
        
        #calculate RVOL for the day using today volume and average volume
        for x in self.rvol.index:
            volTod = self.rvol['vol_today'].loc[x]
            avgVol = self.rvol['avg_vol'].loc[x]
            if volTod > 0 and avgVol > 0:
                self.rvol.at[x, 'rvol'] = volTod/avgVol
            else:
                self.rvol.at[x, 'rvol'] = 0
       
        return(self.rvol)

#---------variables to call the class/method------------#
dadate = date(2019,12,24)
dajob = RelativeVolume()
outputs = dajob.calculate(dadate, 'MBOT') #good symbol and date example to see why massive RVOL matters
print(outputs)
            vol_today        avg_vol     rvol
09:30:00            0       0.000000        0
09:31:00       343648    4923.285714  69.8005
09:32:00       447008    6405.928571  69.7804
09:33:00       537217    7446.571429  72.1429
09:34:00       646290   11637.000000  55.5375
09:35:00       831077   13340.428571  62.2976
09:36:00  1.17107e+06   15557.428571  75.2739
09:37:00  1.79596e+06   17914.214286  100.253
09:38:00  2.12366e+06   18549.857143  114.484
09:39:00  2.31845e+06   19824.857143  116.947
09:40:00  2.77747e+06   21488.928571  129.251
09:41:00  2.93821e+06   23653.642857  124.218
09:42:00  3.07304e+06   25250.000000  121.705
09:43:00  3.31698e+06   25994.500000  127.603
09:44:00  3.99378e+06   26967.000000  148.099
09:45:00  4.42318e+06   27968.000000  158.151
09:46:00  4.79563e+06   29683.214286   161.56
09:47:00  5.00895e+06   30811.142857  162.569
09:48:00  5.40892e+06   31812.142857  170.027
09:49:00  5.54614e+06   32525.071429  170.519
09:50:00  5.83742e+06   33253.357143  175.544
09:51:00  6.24339e+06   34612.428571   180.38
09:52:00  6.67343e+06   35368.214286  188.684
09:53:00  7.03187e+06   35612.500000  197.455
09:54:00  7.22378e+06   36642.000000  197.145
09:55:00  7.36751e+06   37450.642857  196.726
09:56:00  7.52873e+06   38628.214286  194.902
09:57:00  7.66577e+06   39586.214286  193.647
09:58:00  8.19024e+06   40221.642857  203.628
09:59:00  8.57363e+06   41060.357143  208.806
...               ...            ...      ...
15:31:00  2.07192e+07  516437.928571  40.1194
15:32:00  2.07192e+07  518198.071429  39.9831
15:33:00  2.07192e+07  518719.500000  39.9429
15:34:00  2.07192e+07  519735.142857  39.8648
15:35:00  2.07192e+07  521389.428571  39.7384
15:36:00  2.07192e+07  522062.285714  39.6871
15:37:00  2.07192e+07  522744.857143  39.6353
15:38:00  2.07192e+07  523424.642857  39.5838
15:39:00  2.07192e+07  523846.000000   39.552
15:40:00  2.07192e+07  524171.000000  39.5275
15:41:00  2.07192e+07  524808.642857  39.4795
15:42:00  2.07192e+07  530611.785714  39.0477
15:43:00  2.07192e+07  538947.928571  38.4437
15:44:00  2.07192e+07  546611.428571  37.9047
15:45:00  2.07192e+07  553815.428571  37.4117
15:46:00  2.07192e+07  556650.285714  37.2211
15:47:00  2.07192e+07  559428.428571  37.0363
15:48:00  2.07192e+07  560865.000000  36.9414
15:49:00  2.07192e+07  562848.428571  36.8113
15:50:00  2.07192e+07  564035.428571  36.7338
15:51:00  2.07192e+07  565365.000000  36.6474
15:52:00  2.07192e+07  567933.142857  36.4817
15:53:00  2.07192e+07  575167.785714  36.0228
15:54:00  2.07192e+07  579138.142857  35.7759
15:55:00  2.07192e+07  580823.428571   35.672
15:56:00  2.07192e+07  584993.000000  35.4178
15:57:00  2.07192e+07  586573.642857  35.3224
15:58:00  2.07192e+07  588663.857143  35.1969
15:59:00  2.07192e+07  593617.642857  34.9032
16:00:00  2.07192e+07  597703.642857  34.6646

[391 rows x 3 columns]
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]: