Notebook

RVOL - 20 day Relative Volume Calculator

In [260]:
"""
First, if you don't know what RVOL is: it compares the accumulated volume at a specific point in time 
on the day of interest to the average accumulated volume by the same point over the past 20 days. 
Why not just make an SMA of volume and compare your current volume to that? 
RVOL indicates changes in participation minute by minute. For example: 5 minutes into the open, 
accumulated volume is 100K on an average day, and on the day in question it is 1M (RVOL would be 10).
RVOL gives you a better idea of a change in the participation character of the stock in question. 
If the stock normally does 5M in volume on an average day, the 900K difference at that minute might not be 
picked up as significant with an SMA, because 1M is still below the daily average. By the end of the day, 
volume might be 15M or 20M. 

For any given minute in a day, if RVOL is elevated (>1.0 means higher than average, >3.0 means there is an 
abnormal amount of volume), the stock is probably being traded by more participants, which in turn 
might cause it to move more than average (up or down), or range aggressively. 
SMB Capital, a proprietary trading firm located in uptown Manhattan, whose main speaker is Mike Bellafiore, author of
the trading classics "One Good Trade" and "The PlayBook", uses the term "in play" to describe this 
(if you have ever watched more than one of their videos on YouTube, you'll understand that last sentence; it wasn't
an advertising plug for them). When combined with a catalyst like news, it's a good benchmark to let you know the 
stock is something you should spend time watching due to the increased volatility that usually comes with the 
combination.

Summary of the code: 
-Create a relative volume calculation based on a symbol and date of interest as inputs. Relative volume
 needs to be instantiated as an object first, and then .calculate(date, "symbol") is called on that object 
 (see the bottom 3 lines for an example)
-Look back through 20 days of minute data 
-Make a running (cumulative) sum of the volume data minute to minute (accumulate the volume)
-Make a dataframe of the accumulated volumes over the 20 days; the last column to be made 
 (the day we are interested in) is removed from the dataframe and added to the output dataframe as 'vol_today'. 
-Average the accumulated volumes for each row (row = minute of the day, taken from the timestamps), add into a new  
 column using [dataframe].mean(axis = 1). 
-Remove that column and add it to the output RVOL dataframe as 'avg_vol'. 
-Once that is done, vol_today is divided by avg_vol to give RVOL by minute
-You could change the code to only copy the 'vol_today' if you wanted to continue to roll data into the 
 dataframe over more than 1 day. 
-I am no expert programmer. There are likely issues with the code, I have run into a few. This is the cleanest
 most elegant code I could hack together to make this indicator, sorry. You are welcome to do what you want 
 with it and share it. I will change it as I come across issues, but it is up to you to adapt it to your needs. 
"""

import pandas as pd
from datetime import date, timedelta
#get_pricing is not imported here; it is assumed to be provided by the notebook environment

class RelativeVolume:
    def __init__(self):
        self.rvol = []
     
    def calculate(self, tdate, symb):
        #-----initialize variables------------------------------------------------------#
        self.separated_dates = [] #timestamp date handler
        self.separated_times = [] #timestamp time handler
        
        #20 trading days is standard for RVOL calculation, which is four 5-day weeks once weekends are removed
        self.trading_date = tdate #trading date we are interested in, last day in data
        self.rvol_start =  self.trading_date - timedelta(weeks=4) #rvol starting date
        self.symb = symb #symbol that is passed in
        
        #get minute pricing for the symbol and dates of interest
        pricing = get_pricing(self.symb, start_date=str(self.rvol_start), end_date=str(self.trading_date), frequency='minute')
        #print(pricing) #debug
        
        #-----separate dates and times from timestamps into their own arrays----------#
        for x in range(len(pricing.index)):
            dates = pricing.index[x].date()
            times = pricing.index[x].time()
            if dates not in self.separated_dates:
                self.separated_dates.append(dates)
            if times not in self.separated_times:
                self.separated_times.append(times)
        
        #-----Rolling volume accumulation--------------------------------------------#
        #initialize dataframe used to store accumulated volume values
        #if the day used to build this is a holiday, you may have an issue FYI, I didn't test
        rvol_helper = pd.DataFrame(index = self.separated_times)
        
        #make a rolling sum of the volume by minute for each date
        for x in self.separated_dates:
            currDate = x #date we are working on
            minutes = [] #helper array
            rvolx = 0 #used to accumulate volume
            
            #loop through the pricing data, sum the volumes for the specific day
            #append to the minutes array, which is then used to build the rvol_helper day by day
            for y in range(len(pricing.index)):
                if pricing.index[y].date() == currDate:
                    rvolx += pricing.volume.iloc[y] #positional lookup, made explicit with .iloc
                    minutes.append(rvolx)
                    #print(currDate)
                
            #create the column of accumulated volume for this date
            #the assignment raises ValueError when the number of minute bars for the day does not
            #match the length of the index; this is the workaround for that: the day is skipped
            try:
                #print(minutes) #debug
                rvol_helper[x] = minutes 
            except ValueError:
                pass
            
        #------Parse data to dataframes for handling--------------------------------#
        #create RVOL dataframe
        self.rvol = pd.DataFrame(index = self.separated_times, columns = ('vol_today', 'avg_vol', 'rvol'))
        #print(rvol_helper) #debug
        
        #remove day of interest volume, add to RVOL dataframe as 'vol_today'
        self.rvol['vol_today'] = rvol_helper.pop(rvol_helper.columns[-1])
        
        #average the remaining (prior) days of accumulated volume by minute (across rows)
        rvol_helper['avg_vol'] = rvol_helper.mean(axis=1)
        #print(rvol_helper) #debug
        
        #remove the column of average volumes per minute, append to RVOL as 'avg_vol'
        self.rvol['avg_vol'] = rvol_helper.pop(rvol_helper.columns[-1])
        
        #calculate RVOL for the day using today volume and average volume
        self.rvol['rvol'] = self.rvol['vol_today']/self.rvol['avg_vol']
        #print(rvol)#debug
        
        #return the dataframe
        return self.rvol

#---------variables to call the class/method------------#
dadate = date(2019,12,24)
dajob = RelativeVolume()
dajob.calculate(dadate, 'MBOT') #good symbol and date example to see why massive RVOL matters
Out[260]:
vol_today avg_vol rvol
14:31:00 28239.0 2536.764706 11.131896
14:32:00 43953.0 2859.000000 15.373557
14:33:00 52254.0 3268.882353 15.985280
14:34:00 103282.0 3718.176471 27.777595
14:35:00 122980.0 3974.058824 30.945692
14:36:00 144218.0 4556.411765 31.651661
14:37:00 165404.0 5251.058824 31.499171
14:38:00 173103.0 5321.647059 32.528087
14:39:00 185903.0 5618.705882 33.086444
14:40:00 194620.0 6476.352941 30.050864
14:41:00 205320.0 7641.411765 26.869380
14:42:00 214797.0 8422.117647 25.503918
14:43:00 215897.0 9023.000000 23.927408
14:44:00 224701.0 9312.411765 24.129195
14:45:00 230890.0 9772.705882 23.626005
14:46:00 234770.0 10957.000000 21.426485
14:47:00 240520.0 11547.647059 20.828486
14:48:00 243970.0 12169.058824 20.048387
14:49:00 248655.0 12480.588235 19.923340
14:50:00 249905.0 13006.823529 19.213377
14:51:00 252455.0 13976.058824 18.063390
14:52:00 258205.0 14319.058824 18.032261
14:53:00 258605.0 14496.705882 17.838880
14:54:00 262255.0 15129.823529 17.333646
14:55:00 268592.0 15423.000000 17.415030
14:56:00 273598.0 16121.823529 16.970661
14:57:00 278092.0 16646.411765 16.705822
14:58:00 286588.0 16669.941176 17.191902
14:59:00 296430.0 16781.705882 17.663878
15:00:00 302892.0 17174.705882 17.635935
... ... ... ...
20:31:00 5127858.0 127356.058824 40.263950
20:32:00 5150464.0 127475.823529 40.403457
20:33:00 5157014.0 127519.941176 40.440844
20:34:00 5160864.0 128135.764706 40.276530
20:35:00 5181824.0 128265.176471 40.399305
20:36:00 5185927.0 128577.941176 40.332945
20:37:00 5195083.0 128803.941176 40.333261
20:38:00 5202922.0 128902.647059 40.363190
20:39:00 5206472.0 129040.823529 40.347480
20:40:00 5207072.0 129273.176471 40.279601
20:41:00 5213622.0 129413.000000 40.286695
20:42:00 5289036.0 129755.941176 40.761417
20:43:00 5380407.0 131252.117647 40.992916
20:44:00 5467564.0 132465.764706 41.275299
20:45:00 5516563.0 135516.176471 40.707782
20:46:00 5532940.0 136887.411765 40.419641
20:47:00 5554503.0 137906.882353 40.277199
20:48:00 5571064.0 138115.764706 40.336192
20:49:00 5590543.0 138615.117647 40.331409
20:50:00 5600470.0 139008.705882 40.288628
20:51:00 5613862.0 139315.882353 40.295923
20:52:00 5633511.0 140288.764706 40.156537
20:53:00 5723477.0 140954.588235 40.605113
20:54:00 5772675.0 141330.294118 40.845277
20:55:00 5793569.0 141489.117647 40.947100
20:56:00 5838033.0 142321.470588 41.020044
20:57:00 5854047.0 142681.176471 41.028867
20:58:00 5873619.0 143251.235294 41.002222
20:59:00 5929333.0 144053.529412 41.160623
21:00:00 5967429.0 145203.117647 41.097113

390 rows × 3 columns
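
The steps listed in the summary (accumulate volume within each day, line the days up by minute, average the prior days, divide) can also be sketched without the explicit loops. This is only an illustration under stated assumptions, not the code that produced the output above: `rvol_from_minutes` is a hypothetical name, the input is assumed to be a minute-volume Series with a DatetimeIndex whose last date is the day of interest, and the three short days of data are made up so the sketch runs without `get_pricing`.

```python
import pandas as pd


def rvol_from_minutes(volume: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: RVOL by minute from a minute-volume Series.

    Assumes `volume` has a DatetimeIndex and that the latest date in the
    index is the day of interest.
    """
    frame = pd.DataFrame({
        "date": volume.index.date,
        "time": volume.index.time,
        "volume": volume.to_numpy(),
    })
    # Accumulate volume within each day (the running sum from the summary).
    frame["cum_vol"] = frame.groupby("date")["volume"].cumsum()
    # One row per minute of the day, one column per date.
    by_day = frame.pivot(index="time", columns="date", values="cum_vol")
    today = volume.index.max().date()
    out = pd.DataFrame({
        "vol_today": by_day[today],
        # Average the prior days across each row; missing minutes are skipped.
        "avg_vol": by_day.drop(columns=today).mean(axis=1),
    })
    out["rvol"] = out["vol_today"] / out["avg_vol"]
    return out


# Made-up data: three days, three minute bars each.
idx = pd.DatetimeIndex([
    pd.Timestamp(start) + pd.Timedelta(minutes=m)
    for start in ("2019-12-20 14:31", "2019-12-23 14:31", "2019-12-24 14:31")
    for m in range(3)
])
vol = pd.Series([100, 100, 100, 300, 100, 200, 1000, 400, 600],
                index=idx, dtype=float)
demo = rvol_from_minutes(vol)
print(demo)
```

One difference worth knowing about: in this sketch a day with missing minute bars stays in the average (its missing minutes are skipped row by row), whereas the class above leaves out any day whose column assignment fails.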

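The arithmetic from the docstring example can be written out directly. The numbers (100K average accumulated volume five minutes in, 1M on the day in question, 5M average daily volume) and the >1.0 and >3.0 thresholds come from the docstring; the function name `describe_rvol` and the label strings are made up for illustration.

```python
def describe_rvol(rvol: float) -> str:
    """Hypothetical labels for the thresholds given in the docstring."""
    if rvol > 3.0:
        return "abnormal volume"
    if rvol > 1.0:
        return "higher than average"
    return "average or below"


# Five minutes into the open, using the docstring's numbers.
avg_accumulated = 100_000    # average accumulated volume by this minute
today_accumulated = 1_000_000  # accumulated volume on the day in question
avg_daily_volume = 5_000_000   # average full-day volume

rvol_5min = today_accumulated / avg_accumulated        # 10.0
share_of_daily_avg = today_accumulated / avg_daily_volume  # 0.2

print(rvol_5min, describe_rvol(rvol_5min))
print(share_of_daily_avg)
```

Compared against the same minute on an average day, the stock is at 10 times its usual volume. Compared against the average full-day volume, it has only reached a fifth of it, which is why the daily-average comparison does not flag anything this early in the session.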