Reflection
I have encountered many challenges in this project; as mentioned before, data collection is one of the biggest challenges.
Challenges
Another challenge is to read the CSV files in Python. For example, when Python calculates the portfolio variance of a two-stock portfolio (eg. stock A and stock B), the Python program needs to get the values of the stock variance of A, the stock variance of B, and the correlation coefficient between A and B. To do these, I wrote a Python program that reads into the WeeklyReturn.csv and CorrelationMatrix.csv and identifies the stock variance (or correlation coefficient) for a given stock (or two given stocks for correlation). The following code would better illustrate how my Python programs gets the right correlation coefficient from a matrix presented in a CSV file:
myFile = open('CorrelationMatrix.csv','r')
myString = myFile.read() |First I read the entire correlative matrix as a string
myList = re.split('\n', myString) | Break the string down into a list of strings, splitting on every row
for i in range(0,len(myList)-1): | Use a for loop to loop through every string in the list
myList[i] = re.split(',',myList[i]) | Turn the list of strings into a list of lists
stocksList = myList[0] | Get a list of stocks (the rows and column of the matrix is arranged in the order of this list)
numStockA = stocksList.index(stockA) | Find the position of stock A in the stocks list
numStockB = stocksList.index(stockB) | Find the position of stock B in the stocks list
correlation = float(myList[numStockA][numStockB]) | Index into the list of lists using the two positions above and get the right correlation coefficient
The above code works very successfully, and my program successfully returns the variance-minimizing strategy for each pairs of stock.
Another challenge is to come up with a method to evaluate the effectiveness of the variance-minimizing strategy. As mentioned before, looking at the return is not a good way to evaluate the variance-minimizing strategy, because idea of minimizing portfolio variance is to protect investors against significant downside loss. Therefore I came up with the idea to compare the variance-minimizing returns to the maximum returns and minimum returns over three periods of time. Difference between variance-minimizing returns and maximum returns represents the costs of implementing the variance-minimizing strategy, while the difference between variance-minimizing returns and the minimum return represents the protection against significant downside loss (ie. the benefits of implementing the variance-minimizing strategy.) Adding these two differences together gives me an indicator to gauge whether the variance-minimizing strategy is effective.
Further steps
To further test the hypothesis and evaluate the variance-minimizing strategy, the stock variances can be calculated with data over longer or shorter timeframes. For example, in the analysis that I conducted, the stock variances are calculated based the weekly return from 4/24/2009 to 4/27/2012; these stock variances can be re-calculated based on the weekly returns in the past year only or based only the weekly returns in the past five years. By observing different timeframes, we can have a better idea of how effective the variance-minimizing strategy is.
Another additional analysis we can do is to examine the stocks’ expected returns as well, using the CAPM formula (but understanding this formula would require a background in finance): ra = rf + Ba(rm-rf). After calculating the expected return of a two-stock portfolio, we can plot a graph with y-axis being portfolio return and x-axis being portfolio variance. This graph would give us a better visualization of the trade-off between variance and expected return for a given two-stock portfolio.
I have encountered many challenges in this project; as mentioned before, data collection is one of the biggest challenges.
Challenges
Another challenge is to read the CSV files in Python. For example, when Python calculates the portfolio variance of a two-stock portfolio (eg. stock A and stock B), the Python program needs to get the values of the stock variance of A, the stock variance of B, and the correlation coefficient between A and B. To do these, I wrote a Python program that reads into the WeeklyReturn.csv and CorrelationMatrix.csv and identifies the stock variance (or correlation coefficient) for a given stock (or two given stocks for correlation). The following code would better illustrate how my Python programs gets the right correlation coefficient from a matrix presented in a CSV file:
myFile = open('CorrelationMatrix.csv','r')
myString = myFile.read() |First I read the entire correlative matrix as a string
myList = re.split('\n', myString) | Break the string down into a list of strings, splitting on every row
for i in range(0,len(myList)-1): | Use a for loop to loop through every string in the list
myList[i] = re.split(',',myList[i]) | Turn the list of strings into a list of lists
stocksList = myList[0] | Get a list of stocks (the rows and column of the matrix is arranged in the order of this list)
numStockA = stocksList.index(stockA) | Find the position of stock A in the stocks list
numStockB = stocksList.index(stockB) | Find the position of stock B in the stocks list
correlation = float(myList[numStockA][numStockB]) | Index into the list of lists using the two positions above and get the right correlation coefficient
The above code works very successfully, and my program successfully returns the variance-minimizing strategy for each pairs of stock.
Another challenge is to come up with a method to evaluate the effectiveness of the variance-minimizing strategy. As mentioned before, looking at the return is not a good way to evaluate the variance-minimizing strategy, because idea of minimizing portfolio variance is to protect investors against significant downside loss. Therefore I came up with the idea to compare the variance-minimizing returns to the maximum returns and minimum returns over three periods of time. Difference between variance-minimizing returns and maximum returns represents the costs of implementing the variance-minimizing strategy, while the difference between variance-minimizing returns and the minimum return represents the protection against significant downside loss (ie. the benefits of implementing the variance-minimizing strategy.) Adding these two differences together gives me an indicator to gauge whether the variance-minimizing strategy is effective.
Further steps
To further test the hypothesis and evaluate the variance-minimizing strategy, the stock variances can be calculated with data over longer or shorter timeframes. For example, in the analysis that I conducted, the stock variances are calculated based the weekly return from 4/24/2009 to 4/27/2012; these stock variances can be re-calculated based on the weekly returns in the past year only or based only the weekly returns in the past five years. By observing different timeframes, we can have a better idea of how effective the variance-minimizing strategy is.
Another additional analysis we can do is to examine the stocks’ expected returns as well, using the CAPM formula (but understanding this formula would require a background in finance): ra = rf + Ba(rm-rf). After calculating the expected return of a two-stock portfolio, we can plot a graph with y-axis being portfolio return and x-axis being portfolio variance. This graph would give us a better visualization of the trade-off between variance and expected return for a given two-stock portfolio.