/********************************************************************************************
DESCRIPTION,
TERMS OF USE, AND PREVIEW OF THE RESULTS
-----------------------------------------------------
This program reproduces
the key result documented in the paper, "Managerial Activeness and Mutual Fund
Performance".
If you use the program in any way, I ask that you please cite
the paper in your work:
Doshi, Hitesh,
Redouane Elkamhi, and Mikhail Simutin, 2015, Managerial Activeness and
Mutual Fund
Performance, Review of Asset Pricing Studies, 5 (2), 156-184.
For every each fund/portfolio discolosure
date, this program calculates
active weight (AW), the fraction of a fund's portfolio holdings that differs
from the
value-weighted
index of these holdings. It then shows that equity mutual fund managers who
exhibit higher
degrees of managerial activeness as proxied for by high active weight
significantly
outperform passive managers.
This program as
a whole or in part should be helpful to many wrestling with mutual funds
data on WRDS. It
can be used to replicate the results of our paper and (with slight changes)
the results of
many other papers that use similar data. Specifically, parts 1, 2, 3, and 5
should be of use
in many other applications.
There are five
parts to the program:
Part 1
constructs a dataset that contains details on portfolio holdings of mutual
funds,
Part 2
identifies diversified domestic actively managed equity fund sample,
Part 3
constructs a dataset collecting returns of the funds
Part 4 computes
active weight, AW, and
Part 5 is a
macro (function) that assigns funds into portfolios (number of portfolios,
e.g.,
deciles, quintiles can be specified by
the user) on the basis of AW and computes
future raw and risk-adjusted returns
(net or gross as specified by the user) for each
portfolio and for the High-Low portfolio
I tried to keep
the program as simple as possible and provide as many comments as possible
to help those
who either want to reproduce the results or to those interested in learning
about linking
and analyzing different datasets used in mutual fund research. Unlike in the
paper, I do not
impose some additional restrictions (e.g., here I do not control for the
incubator bias)
just to keep things simpler. The sample period here also ends in 2013 rather
than in 2012.
Hence the numbers differ slightly relative to those in the paper.
To run this
program on WRDS, you need to have subscriptions to Thomson Financial, MFLINK,
Fama and French,
and CRSP mutual fund and monthly stock datasets. Specifically, the program
uses the
following SAS datasets: tfn.s12type1, tfn.s12type3, crsp.msenames, crsp.msf,
crsp.fund_style,
crsp.fund_names, crsp.monthly_tna_ret_nav, crsp.fund_fees,
mfl.mflink1,
mfl.mflink2, and ff.factors_monthly.
To execute the
program, just copy this code into a 'aw.sas' file and upload the file
onto you home
(or other) directory on WRDS server and run the following command:
qsas aw.sas
It will take a
few minutes to execute and will produce output in 'aw.lst' file in the same
folder as
'aw.sas' Here, for example, are the Carhart alphas (in percent per year) and
loadings of
decile portfolios computed (as of today, Sept 23, 2016) using gross returns:
Portfolio CarhartAlpha Market Value
Size
Momentum
Low -1.48** 1.03***
-0.14*** 0.15***
0.01
2 -0.50
1.00***
-0.12*** 0.20***
0.01
3 0.06 1.03***
-0.07*** 0.19***
0.03***
4 -0.28
1.02*** -0.07*** 0.14***
0.04***
5 0.15 1.01***
-0.12*** 0.11***
0.01
6 0.28 1.01***
-0.11*** 0.08***
0.01
7 0.45 1.00***
-0.07*** 0.10***
0.02**
8 0.48 1.00***
-0.12*** 0.14***
0.01*
9 1.11**
0.98***
-0.09*** 0.21***
0.00
High 1.85***
0.94***
-0.08*** 0.32***
-0.01
High - Low 3.33***
-0.09***
0.05**
0.17***
-0.02
t-statistic [3.97] [-5.25]
[2.08]
[7.43]
[-1.54]
As an aside,
WRDS occasionally changes datasets like it did just a short while ago with
MFLINKS files,
so if you get an error, this is the reason. Look at the log file for details.
Please feel free
to contact me if you have any questions, Mikhail.Simutin@rotman.utoronto.ca
********************************************************************************************/
/* Dates are not
perfectly aligned in different WRDS databases. E.g., CRSP monthly files
often define date as the last trading day of
the month whereas some other datasets
may use the actual last day of the month.
This is a little handy way to define a month dummy
that
sequentially numbers months. md = 1 for December 1925 (first month in CRSP) and
md = 1045 corresponds to December 2012. It's
useful in applications below. */
%let
md = (year(date)*12 + month(date) - 1925*12 - 11);
/********************************************************************************************
PART 1: create a dataset containing the
following variables:
wficn - fund identifier
md - month dummy indexing months
permno - stock identifier
D - dollars invested in a the 'permno'
stock
********************************************************************************************/
/* Thomson
Financial s12type1 file reports two dates: RDATE and FDATE. As of this writing,
out of 1133304 observations in that file,
only 242739 (21%) had the two dates equal.
RDATE is the date as of which the positions
are held by the fund. FDATE is a 'vintage date'
and is used as a key for joining some
databases. Not sure why it is called 'F'DATE, but
it has nothing to do with a filing date for
our intents and purposes. Check out
http://goo.gl/lMI3l (esp. page 5) for
details (you have to be a WRDS member).
As an example, try pulling fundno = 1099
(ROYCE HERITAGE FUND). You will see that for
many RDATEs (e.g., RDATE = 30JUN1996), there
are multiple FDATEs. We'll use this fund later
in another example.
For a particular FUNDNO-RDATE pair, we now
pick the first FDATE because "when a fund's
holdings are not reported in the next quarter
or if the holdings data for a fund fail
to be collected for any reason, both RDATE
and holdings data in other tables are carried
forward to the next FDATE set, creating
cases of 'stale' data." Using the first available
FDATE for a given FUNDNO-RDATE avoids the
staleness. We also exclude funds with
investment objective codes (IOC) of 1,5,6,
and 7: International, Municipal Bonds,
Bond & Preferred, and Balanced,
respectively, just to keep the size of the file down. */
proc
sort data
= tfn.s12type1 (keep=fundno rdate fdate ioc assets
where=(ioc not in (1,5,6,7)))
out = holdings1 (drop=ioc);
by
fundno rdate fdate;
run;
/* there are
several funds that report more than once in a given month; for those funds
we keep only the last report of the month
(e.g., fundno = 72600 in Nov 1997) */
data
holdings1; set holdings1;
by
fundno rdate;
if
first.rdate;
date = rdate;
md = &md;
drop
date;
run;
/* 'md' is month
dummy that corresponds to the month of RDATE. Note that FUNDNO-FDATE and
FUNDNO-RDATE are unique identifiers (keys) */
data
holdings1; set holdings1;
by
fundno md;
if
last.md;
run;
/* MFLINKS file
was recently updated to allow more direct linking from Thomson Financial
Note that fundno+fdate+rdate is a unique
identifier (key) in both mfl.mflink2 and hodlings1
If you only have access to the old (pre-June
2013) version of mfl.mflink2 file, then replace
the following (uncommented) statement that
creates table holdings2 with these ones:
proc sql;
create table mflink as select wficn,
fundno, sdate1,
case when sdate2 = max(sdate2) then
mdy(12,31,2012) else sdate2 end as sdate2 format date9.
from mfl.mflink2;
create table holdings2 as select
b.wficn, a.*
from holdings1 as a, mflink as b
where a.fundno = b.fundno and fdate
between sdate1 and sdate2
order by wficn, rdate, assets;
drop table mflink;
quit;
Note that the previous version of
mfl.mflink2 file stops in June of 2010. The above
statements assume here that the links that
are valid at that time remain valid through
the end of 2012. This is likely accurate for
the vast majority of the funds. This assumption
can be improved upon (tediously) by using
portfolio identifiers from CRSP (crsp_cl_grp),
which are available starting in 1998. */
proc
sql;
create
table holdings2 (where=(wficn
ne .)) as
select b.wficn, a.*
from
holdings1 as a, mfl.mflink2 as
b
where
a.fundno = b.fundno and a.fdate = b.fdate and
a.rdate = b.rdate
order
by wficn, rdate, assets;
quit;
/* there are
several instances when wficn-rdate is not unique; in those cases, pick the
fund with largest assets */
data
holdings2; set holdings2;
by
wficn rdate;
if
last.rdate;
run;
/* add holdings from
the s12type3 dataset: we don't need FUNDNO after this, so drop it */
proc
sql;
create
table holdings3
as
select wficn, rdate, md, a.fdate, assets, cusip, shares
from
holdings2 as a, tfn.s12type3 as
b
where
a.fdate = b.fdate and a.fundno = b.fundno;
quit;
/* link
historical CUSIP to CRSP to get unique stock identifier PERMNO */
proc
sql;
create
table holdings4
as
select wficn, rdate, md, fdate, assets, permno, shares
from
holdings3 as a, (select
distinct ncusip, permno from
crsp.msenames
where
not missing(ncusip)) as
b
where
a.cusip = b.ncusip;
quit;
/* one last
thing to do is to adjust shares held by the fund for stock splits, etc. The
number
of shares is reported as of FDATE but the
effective holding date is RDATE. If a particular
stock (PERMNO) had some event such as a
stock split which affected the number of shares
outstanding, the reported number of shares
needs to be adjusted.
As an example, let's continue with fundno =
1099 (ROYCE HERITAGE FUND) reference above. Its
wficn is 105794, and its SEC's Edgar filings
are here: http://goo.gl/t5mAi
On RDATE=March 31 2010, the fund held permno
= 76489 (LUFKIN INDUSTRIES), and the number
of shares reported in s12type3 file is
43,400. However, if we dig up the actual SEC filing
for that quarter (http://goo.gl/yTkHb),
we'll see that the fund actually held only 21,700
shares which amounted to an investment of
$1,717,555. Why the difference? It's because
Lufkin Industries had a 2:1 split in June
2010, and the number of shares in s12type3 is
already adjusted for the split. So we need
to re-adjust it back before we compute dollar
investments for every fund-month-stock
observation */
proc
sql;
create
table holdings5
as
select a.*, b.cfacshr as
cfacshr_fdate
from
holdings4 as a, crsp.msf (keep=permno date
cfacshr) as b
where
a.permno = b.permno and year(fdate) =
year(date) and month(fdate) =
month(date);
create
table holdings6
as
select a.*,
round(shares*cfacshr_fdate/b.cfacshr,1)
as shares_adj
from
holdings5 as a, crsp.msf (keep=permno date
cfacshr) as b
where
a.permno = b.permno and year(rdate) =
year(date) and month(rdate) =
month(date);
create
table holdings7 (where=(D
> 0)) as
select wficn, md, a.permno,
case when
shares_adj*abs(prc) > 0 then
shares_adj*abs(prc) else shares*abs(prc) end as
D
from
holdings6 as a, crsp.msf (keep=permno date prc)
as b
where
a.permno = b.permno and year(rdate) =
year(date) and month(rdate) =
month(date);
quit;
/* we can
confirm that holdings of ROYCE HERITAGE FUND (wficn=105794) in March 2010
(md=1012)
of LUFKIN INDUSTRIES stock (PEMRNO=76489) is
indeed the correct $1,717,555 */
data
_null_; set
holdings7;
if
wficn = 105794 and md = 1012
and permno = 76489;
format
D comma12.; put
D;
run;
/* delete
several duplicates to obtain the final holdings file we wanted */
proc
sort data
= holdings7 out = holdings nodupkey;
by wficn md permno; run;
/* delete
unneeded datasets */
proc
datasets nowarn
nolist nodetails;
delete
holdings1 - holdings7 mflink;
run;
quit;
/********************************************************************************************
PART 2: create a dataset that links
CRSP_FUNDNO share class identifiers from CRSP to
WFICN fund identifier from MFLINK and keep
only equity funds. The dataset contains
WFICN (fund identifier) and CRSP_FUNDNO
(share class identifier). There are other ways
of doing this, and you can certainly use
permutations of this. Here, we make use of the
recently introduced investment style
variable from CRSP (crsp_obj_cd) and also use the
longer approach that relies on other style
identifiers. At the end of the day, the two
approaches give virtually identical
rankings, and we make use of both here
********************************************************************************************/
/* different
investment objective codes cover different time period. These codes are
assigned
by various institutions and cover a range of
time periods and funds: Strategic Insight
objective codes are available between 1992
and 1999, Lipper classifications are present
from the end of 1999 through 2008, Wiescat
codes begin in early 2008, and Wiesenberger
codes run through 1993. */
/* CRSP policies
to exclude: note that it is available through 1990 only */
%let
crsp_policy_to_exclude='C & I','Bal','Bonds','Pfd','B &
P','GS','MM','TFM';
/* lipper
classes to include */
%let
lipper_class=('EIEI','G','LCCE','LCGE','LCVE','MCCE','MCGE','MCVE',
'MLCE','MLGE','MLVE','SCCE','SCGE','SCVE');
%let
lipper_obj_cd=('CA','EI','G','GI','MC','MR','SG');
/* strategic
insight classes to include */
%let
si_obj_cd=('AGG','GMC','GRI','GRO','ING','SCG');
/* weisenberger
classes to include */
%let
wbrger_obj_cd=('G','GCI','IEQ','LTG','MCG','SCG');
/* retain latest
style for every crsp_fundno */
data
fund_style; set crsp.fund_style;
format
lipper_classX $4.lipper_obj_cdX $3.
si_obj_cdX $3. wbrger_obj_cdX $5.;
by
crsp_fundno;
%macro
temp (var=);
retain &var.X;
if first.crsp_fundno then &var.X
= &var;
else if &var ne ''
then &var.X = &var;
%mend
temp;
%temp(var=lipper_class);
%temp(var=lipper_obj_cd);
%temp(var=si_obj_cd);
%temp(var=wbrger_obj_cd);
run;
data
latest_style (rename=(lipper_classX=lipper_class lipper_obj_cdX=lipper_obj_cd
si_obj_cdX=si_obj_cd
wbrger_obj_cdX=wbrger_obj_cd)); set
fund_style;
drop
lipper_class lipper_obj_cd si_obj_cd wbrger_obj_cd;
by
crsp_fundno;
if
last.crsp_fundno;
keep
crsp_fundno lipper_classX lipper_obj_cdX si_obj_cdX wbrger_obj_cdX crsp_obj_cd;
run;
/* keep most
recent name of every share class */
data
names; set crsp.fund_names (where=(fund_name
ne ''));
by
crsp_fundno;
if
last.crsp_fundno;
keep
crsp_fundno fund_name index_fund_flag;
run;
/* domestic
equity mutual funds: this borrows from return_gap.sas sample program on
WRDS, http://goo.gl/DB1iz (you have to be a
WRDS member to access that link) */
data
funds; set fund_style
(keep=crsp_fundno si_obj_cd wbrger_obj_cd policy lipper_class
lipper_obj_cd where=(policy
not in (&crsp_policy_to_exclude)));
long_way = 1;
dummy = 0;
if
lipper_class in &lipper_class or lipper_obj_cd in (&lipper_obj_cd) then
output;
else
if missing(lipper_class)=1
and missing(lipper_obj_cd)=1 and
si_obj_cd in &si_obj_cd then
output;
else
if missing(lipper_class)=1
and missing(lipper_obj_cd)=1 and
missing(si_obj_cd)=1
and wbrger_obj_cd in
&wbrger_obj_cd then output;
/* note that these are
unnecessary: for all but seven funds, crsp_fundnos are
identified as equity funds without these statements, and none of the
seven
funds that have all dummy = 1 (and no dummy = 0) cases is a domestic
diversified equity funds, so actually that merging to get avrcs is not
needed */
/* else if
missing(lipper_class)=1 and missing(lipper_obj_cd)=1 and missing(si_obj_cd)=1
and missing(wbrger_obj_cd)=1 and
policy='CS' then do;
dummy = 1; output;end;
else if missing(lipper_class)=1 and
missing(lipper_obj_cd)=1 and missing(si_obj_cd)=1
and missing(wbrger_obj_cd)=1 and
missing(policy)=1 and 80<=avrcs<=105 then do;
dummy = 1; output; end; */
keep
crsp_fundno long_way;
run;
proc
sort data
= funds nodupkey; by
crsp_fundno; run;
/* shorter way
of identifying objectives using crsp_obj_cd variable */
data
funds2; set crsp.fund_style;
if
substr(crsp_obj_cd,1,1)
= 'E'; /* equity */
if
substr(crsp_obj_cd,2,1)
= 'D'; /* domestic */
if
substr(crsp_obj_cd,3,1)
in ('C','Y'); /*
cap-based or style */
if
substr(crsp_obj_cd,3,2)
not in ('YH','YS');
/* exclude hedged, short */
if
si_obj_cd ne 'OPI'; /*
exclude option income */
short_way = 1;
keep
crsp_fundno short_way;
run;
proc
sort data
= funds2 nodupkey; by
crsp_fundno; run;
/* there are
only 85 funds that don't overlap between long way and short way: all
of them are in the CRSP sample (the one
obtained using crsp_obj_cd) but not in mine:
all of them appear to have changed style to
international or are otherwise international */
data
funds3; merge funds funds2
latest_style names;
by
crsp_fundno;
if
long_way = 1 or short_way = 1;
if
long_way = short_way then delete;
run;
/* look for
funds that have flip-flopped their style */
proc
sql;
create
table funds4 as
select a.*, b.crsp_obj_cd
from
funds as a left
join fund_style (keep=crsp_fundno crsp_obj_cd begdt) as
b
on
a.crsp_fundno = b.crsp_fundno
order
by a.crsp_fundno, begdt;
quit;
data
funds4; set funds4;
if
substr(crsp_obj_cd,1,3)
not in ('EDC','EDY')
then flipper = 1;
else
flipper = 0;
run;
proc
sql;
create
table funds5 as
select crsp_fundno,
max(flipper) as flipper from
funds4
group
by crsp_fundno;
quit;
/* identify
index and target date funds and drop them from the sample */
data
funds6; merge funds5 (in=in1)
names;
by
crsp_fundno;
if
in1;
run;
proc
sql;
create
table funds7 (drop=index_fund_flag) as
select * from
funds6 group by
crsp_fundno
having count(index_fund_flag) = 0;
quit;
data
funds8; set funds7;
format
namex $140.;
namex = lowcase(fund_name);
if
max(index(namex,'index'), index(namex,'s&p'))
= 0;
if
max(index(namex,'idx'), index(namex,'dfa'),
index(namex,'program')) = 0;
if
max(indexw(namex,'etf'),index(namex,'exchange
traded'),
index(namex,'exchange-traded'))
= 0;
if
max(index(namex,'target'),index(namex,'2005'),index(namex,'2005'),
index(namex,'2010'),index(namex,'2015'),index(namex,'2020'),index(namex,'2025'),
index(namex,'2030'),index(namex,'2035'),index(namex,'2040'),
index(namex,'2045'),index(namex,'2050'),index(namex,'2055'))
= 0;
drop
namex fund_name;
run;
/* group into
funds using mflinks and exclude flippers: drops
sample slightly. there are
some unusual
observations where some share classes change style while other share
classes in the same
fund do not (e.g., wficn=100001) */
proc
sql;
create
table equity_funds (where=(wficn ne .))
as
select
b.wficn, a.crsp_fundno from
funds8 as a left
join mfl.mflink1 as
b
on
a.crsp_fundno=b.crsp_fundno
group
by b.wficn having
max(flipper) = 0 order
by crsp_fundno;
quit;
/* delete
unneeded datasets */
proc
datasets nowarn
nolist nodetails;
delete
funds funds2-funds8 fund_style latest_style names;
run;
quit;
/********************************************************************************************
PART 3: create a dataset that for every
equity fund WFICN and month dummy MD contains size
TNA, net return RET, and gross return
RRET. Some code below borrows from return_gap.sas sample
program on WRDS, http://goo.gl/DB1iz (you
have to be a WRDS member to access that link)
********************************************************************************************/
data
returns1; merge equity_funds (in=in1)
crsp.monthly_tna_ret_nav (drop=mnav);
by
crsp_fundno;
if
in1 = 1;
if
wficn ne .;
if
mtna < 0
then mtna = .;
mtna = mtna + 0;
mret = mret + 0;
/* retain last
available tna */
retain
tna;
if
first.crsp_fundno or mtna ne .
then tna = mtna;
rename
caldt=date;
drop
mtna;
run;
/* get expense
ratio data required to get gross returns */
proc
sql;
create
table returns2 as
select a.*, b.exp_ratio
from
returns1 (rename=(tna=mtna)) as a left
join crsp.fund_fees as
b
on
a.crsp_fundno=b.crsp_fundno and date between
b.begdt and b.enddt
order
by crsp_fundno, date;
quit;
/* compute gross
returns */
data
returns2; set returns2;
by
crsp_fundno date;
if
exp_ratio = -99 then
exp_ratio=.;
weight = lag(mtna);
if
first.crsp_fundno then weight = mtna;
rret=sum(mret,exp_ratio/12);
drop
exp_ratio;
run;
/* aggregate
multiple share classes */
proc
sort data
= returns2; by wficn date; run;
data
multiclass1 oneclass; set returns2;
by
wficn date;
if
first.date=0 or last.date=0
then output multiclass1;
else
output oneclass;
run;
proc
sql;
create
table multiclass2
as
select wficn, date, sum(mret*weight)/sum(weight) as
mret,
sum(mtna) as
mtna, sum(rret*weight)/sum(weight) as
rret
from
multiclass1 group by
wficn, date;
quit;
data
returns; set oneclass (drop=weight)
multiclass2;
md = &md;
rename
mret=ret mtna=tna;
drop
crsp_fundno date;
run;
proc
sort data
= returns nodupkey; by
wficn md; run;
proc
datasets nowarn
nolist nodetails;
delete
multiclass1 multiclass2 oneclass returns1 returns2;
quit;
/********************************************************************************************
PART 4: compute active weights, AW. The
resulting dataset contains
for each fund (WFICN) and month dummy
(MD), a measure of AW
********************************************************************************************/
proc
sort data
= equity_funds (keep=wficn) out = funds0 nodupkey;
by wficn; run;
proc
sql;
create
table holdings1 as
select a.*,
abs(b.prc)*b.shrout as ME from
holdings
as
a left join crsp.msf
(keep=permno date prc shrout)
as
b on a.permno = b.permno and
a.md = &md;
create
table holdings2 as
select wficn, md, D, D/sum(D) as
w, ME/sum(ME) as wME
from
holdings1 (where=(ME > 0
and D>0))
group by wficn, md;
create
table holdings3 as
select wficn, md, sum(abs(w-wME)) as
aw, N(w) as Nstocks,
sum(D) as
sumD from
holdings2 group by
wficn, md;
quit;
/* for each calendar quarter at some point
during which a fund disclosed its stock holdings,
md is redefined as a month dummy
corresponding to the last calendar month of that quarter;
require at least 15M invested in at least
10 stocks */
data
holdings4; merge funds0 (in=in1)
holdings3;
by
wficn;
if
in1 = 1;
md = ceil((md-1)/3)*3
+ 1;
if
Nstocks >= 10 and sumD >= 15;
run;
proc
sort data
= holdings4 out = aw (keep=wficn md
aw); by md; run;
/* delete some
datasets we don't need anymore */
proc
datasets nowarn
nolist nodetails;
delete
funds0 holdings1-holdings4 holdings equity_funds;
run;
quit;
/* As an aside,
if you are interested in whether AW is highly correlated with active share,
it is not. In this setup with fewer data
restrictions than in the paper, rho=0.26 based on the
1990-2006 sample originally available from
Cremers and Petajisto and rho=0.20 based on
a recently updated sample that extends through
2009). Refer to the paper for details,
but here is piece of code computing the
correlations. You will need to download
activeshare SAS dataset
(http://www.petajisto.net/data.html). Place it in a folder that is
referenced in the libname statement below:
libname
as 'e:\Research\'; *<--- Update with
your path to activeshare dataset;
data as1; set as.activeshare (keep=wficn
rdate activeshare);
date = rdate;
md = ceil((&md-1)/3)*3 + 1;
drop rdate date;
run;
proc sort data = aw out = as2; by wficn
md; run;
data as2; merge as1 (in=in1) as2 (in=in2);
by wficn md;
if in1 = 1 and in2 = 1;
run;
proc sort data = as2; by md; run;
proc corr data = as2 noprint outp = as3
(where=(_type_='CORR') rename=(aw=corr));
var aw;
with activeshare;
by md;
run;
proc means data = as3 mean;
var corr;
run;
If you'd like, you can run the code in
Part 5 using active share rather than aw and
compare the results. Just run this
statement that effectively renames active share
as aw and then you can execute the code
below in part 5 keeping in mind that you
are actually doing the sorts using active
share:
data awx; set as2 (drop=aw);
rename activeshare=aw;
run;
*/
/********************************************************************************************
PART 5: compute returns, betas, etc for
AW-sorted portfolios. Call the following macro
by providing the number of groups you would
like to sort funds into and by specifying
whether to use net return (in which case
you should set ret=ret) or gross return
(in which case you should set ret=rret).
The results are printed into the .lst file
********************************************************************************************/
%macro
get_portfolio_results(groups=,ret=);
proc rank data =
aw out = aw1 (keep=wficn rank md) groups = &groups;
var AW;
by md;
ranks rank;
run;
/* add fund size
at the time of AW calculation and returns over the subsequent 12 months
you can modify the length of holding period here
if you want by changing 12 to, say, 3 */
proc sql;
create table aw2 as select a.*,
b.tna from aw1 as a left join returns as b
on a.wficn = b.wficn and a.md = b.md;
create table aw3 as select a.*,
b.md as mdret, b.&ret as ret
from aw2 as a left join returns as b on
a.wficn = b.wficn and 1 <= b.md-a.md <= 12
order by a.wficn, a.md, b.md;
quit;
data aw3; set
aw3;
by wficn md;
retain w;
lagret = lag(ret);
if first.md then w = tna;
else w = w * (1+lagret);
drop lagret tna md;
run;
proc means data
= aw3 nway noprint;
var ret;
class rank mdret;
weight w;
output out = aw4 (drop=_type_ _freq_)
mean=ret;
run;
/* high - low
portfolio (for later convenience, it's dubbed rank = 99) */
proc sql;
create table aw5 as select a.mdret, 99
as rank, a.ret-b.ret as ret
from aw4 (where=(rank=%eval(&groups-1)))
as a left join aw4 (where=(rank=0))
as b
on a.mdret = b.mdret order by a.mdret;
quit;
data aw5; set
aw4 aw5; run;
/* the code below
computes unconditional alphas only; for conditional alphas, it's
straightforward to get conditioning
variables and modify the code below
If you want to modify the sample period,
this is the place to do it. */
proc sql;
create table aw6 as
select a.*,
date, ret-rf as retmrf, mktrf as rm, hml, smb, umd
from aw5 as a left join ff.factors_monthly
as b
on a.mdret = &md order by rank, date;
quit;
data aw6; set
aw6;
if rank = 99
then do;
retmrf = ret;
end;
run;
proc reg data =
aw6 noprint adjrsq tableout
outest = params (keep=rank intercept rm
hml smb umd _type_ _adjrsq_
where=(_type_ in ('T'
'PARMS' 'PVALUE')));
model retmrf = rm hml smb umd;
by rank;
output out=IVol1 (keep=rank r) r=r;
run;
data params; set
params;
if _type_ = 'PARMS'
then do;
intercept = intercept*1200;
R2 = _adjrsq_ * 100;
end;
run;
/* macro to make
the results look presentable */
%macro
format_regres(var=,name=);
proc transpose data = params out =
&name (drop=_name_);
var &var;
by rank;
id _type_;
run;
data &name; set &name;
format ast $8.;
if pvalue <= 0.01
then ast='***';
else if pvalue <= 0.05
then ast='**';
else if pvalue <= 0.10
then ast='*';
output;
if rank = 99
then do;
rank = 100;
parms = T;
ast = '';
output;
end;
drop T Pvalue;
run;
data &name; set &name;
parms = round(parms*100)/100;
format parms 12.5;
format &name dummy $15.;
if floor(parms) = parms then dummy =
compress(parms)||'.00000';
else dummy = compress(parms)||'00000';
var = compress(substr(dummy,1,10));
dot = index(var,'.');
if abs(parms) < 10
then &name = substr(var,1,dot+2);
else &name = substr(var,1,dot+1);
if parms >= 0
and rank ne 100 then &name = '~'||compress(&name);
&name =
compress(&name)||compress(ast);
if rank = 100
then &name = '['||compress(&name)||']';
if compress(&name) = '-0.00'
then &name = '~0.00';
&name = tranwrd(&name,'~','
');
keep rank &name;
run;
%mend
format_regres;
%format_regres(var=intercept,name=CarhartAlpha);
%format_regres(var=rm,name=Market);
%format_regres(var=hml,name=Value);
%format_regres(var=smb,name=Size);
%format_regres(var=umd,name=Momentum);
%format_regres(var=R2,name=R2);
proc
means data
= IVol1 nway noprint;
var
r;
class
rank;
output
out = IVol2 (drop=_type_ _freq_) std=IVol;
run;
data
results; merge CarhartAlpha market
value size momentum r2 IVol2;
by
rank;
IVol = IVol * 100;
r2 = tranwrd(r2,'***','');
if
rank = 100 then
r2 = .;
format
IVol 4.2;
format
Portfolio $11.;
Portfolio = compress(substr(rank+1,1));
if
rank = 0 then
Portfolio = 'Low';
if
rank = %eval(&groups-1)
then Portfolio = 'High';
if
rank = 99 then
Portfolio = 'High - Low';
if
rank = 100 then
Portfolio = 't-statistic';
run;
%if
&ret=ret %then %do;
title
"Results Based on Net Returns of &groups AW-Sorted
Portfolios";
%end;
%if
&ret=rret %then %do;
title
"Results Based on Gross Returns of &groups AW-Sorted
Portfolios";
%end;
proc
print data
= results noobs;
var
Portfolio CarhartAlpha Market Value Size Momentum R2 IVol;
run;
title;
proc
datasets nowarn
nolist nodetails;
delete
aw1-aw6 CarhartAlpha market value size momentum r2 ivol1 ivol2 params results;
run;
quit;
%mend
get_portfolio_results;
/* call the
macros*/
%get_portfolio_results(groups=10,ret=ret);
%get_portfolio_results(groups=10,ret=rret);