/********************************************************************************************

DESCRIPTION, TERMS OF USE, AND PREVIEW OF THE RESULTS

-----------------------------------------------------

This program reproduces the key result documented in the paper, "Managerial Activeness and Mutual Fund Performance".

If you use the program in any way, I ask that you please cite the paper in your work:

Doshi, Hitesh, Redouane Elkamhi, and Mikhail Simutin, 2015, Managerial Activeness and

Mutual Fund Performance, Review of Asset Pricing Studies, 5 (2), 156-184.

 

For every each fund/portfolio discolosure date, this program calculates

active weight (AW), the fraction of a fund's portfolio holdings that differs from the

value-weighted index of these holdings. It then shows that equity mutual fund managers who

exhibit higher degrees of managerial activeness as proxied for by high active weight

significantly outperform passive managers.

 

This program as a whole or in part should be helpful to many wrestling with mutual funds

data on WRDS. It can be used to replicate the results of our paper and (with slight changes)

the results of many other papers that use similar data. Specifically, parts 1, 2, 3, and 5

should be of use in many other applications.

 

There are five parts to the program:

Part 1 constructs a dataset that contains details on portfolio holdings of mutual funds,

Part 2 identifies diversified domestic actively managed equity fund sample,

Part 3 constructs a dataset collecting returns of the funds

Part 4 computes active weight, AW, and

Part 5 is a macro (function) that assigns funds into portfolios (number of portfolios, e.g.,

       deciles, quintiles can be specified by the user) on the basis of AW and computes

       future raw and risk-adjusted returns (net or gross as specified by the user) for each

       portfolio and for the High-Low portfolio       

 

I tried to keep the program as simple as possible and provide as many comments as possible

to help those who either want to reproduce the results or to those interested in learning

about linking and analyzing different datasets used in mutual fund research. Unlike in the

paper, I do not impose some additional restrictions (e.g., here I do not control for the

incubator bias) just to keep things simpler. The sample period here also ends in 2013 rather

than in 2012. Hence the numbers differ slightly relative to those in the paper.

 

To run this program on WRDS, you need to have subscriptions to Thomson Financial, MFLINK,

Fama and French, and CRSP mutual fund and monthly stock datasets. Specifically, the program

uses the following SAS datasets: tfn.s12type1, tfn.s12type3, crsp.msenames, crsp.msf,

crsp.fund_style, crsp.fund_names, crsp.monthly_tna_ret_nav, crsp.fund_fees,

mfl.mflink1, mfl.mflink2, and ff.factors_monthly.

 

To execute the program, just copy this code into a 'aw.sas' file and upload the file

onto you home (or other) directory on WRDS server and run the following command:

qsas aw.sas

It will take a few minutes to execute and will produce output in 'aw.lst' file in the same

folder as 'aw.sas' Here, for example, are the Carhart alphas (in percent per year) and

loadings of decile portfolios computed (as of today, Sept 23, 2016) using gross returns:

 

Portfolio    CarhartAlpha    Market       Value        Size         Momentum

Low          -1.48**         1.03***     -0.14***      0.15***      0.01

2            -0.50           1.00***     -0.12***      0.20***      0.01

3             0.06           1.03***     -0.07***      0.19***      0.03***

4            -0.28           1.02***     -0.07***      0.14***      0.04***

5             0.15           1.01***     -0.12***      0.11***      0.01

6             0.28           1.01***     -0.11***      0.08***      0.01

7             0.45           1.00***     -0.07***      0.10***      0.02**

8             0.48           1.00***     -0.12***      0.14***      0.01*

9             1.11**         0.98***     -0.09***      0.21***      0.00

High          1.85***        0.94***     -0.08***      0.32***     -0.01

High - Low    3.33***       -0.09***      0.05**       0.17***     -0.02

t-statistic  [3.97]         [-5.25]      [2.08]       [7.43]      [-1.54]

 

As an aside, WRDS occasionally changes datasets like it did just a short while ago with

MFLINKS files, so if you get an error, this is the reason. Look at the log file for details.

 

Please feel free to contact me if you have any questions, Mikhail.Simutin@rotman.utoronto.ca

********************************************************************************************/

 

 

 

 

/* Dates are not perfectly aligned in different WRDS databases. E.g., CRSP monthly files

   often define date as the last trading day of the month whereas some other datasets

   may use the actual last day of the month. This is a little handy way to define a month dummy

   that sequentially numbers months. md = 1 for December 1925 (first month in CRSP) and

   md = 1045 corresponds to December 2012. It's useful in applications below. */

%let md = (year(date)*12 + month(date) - 1925*12 - 11);

 

 

/********************************************************************************************

      PART 1: create a dataset containing the following variables:

      wficn - fund identifier

      md - month dummy indexing months

      permno - stock identifier

      D - dollars invested in a the 'permno' stock

********************************************************************************************/

 

/* Thomson Financial s12type1 file reports two dates: RDATE and FDATE. As of this writing,

   out of 1133304 observations in that file, only 242739 (21%) had the two dates equal.

   RDATE is the date as of which the positions are held by the fund. FDATE is a 'vintage date'

   and is used as a key for joining some databases. Not sure why it is called 'F'DATE, but

   it has nothing to do with a filing date for our intents and purposes. Check out

   http://goo.gl/lMI3l (esp. page 5) for details (you have to be a WRDS member).

 

   As an example, try pulling fundno = 1099 (ROYCE HERITAGE FUND). You will see that for

   many RDATEs (e.g., RDATE = 30JUN1996), there are multiple FDATEs. We'll use this fund later

   in another example.

 

   For a particular FUNDNO-RDATE pair, we now pick the first FDATE because "when a fund's

   holdings are not reported in the next quarter or if the holdings data for a fund fail

   to be collected for any reason, both RDATE and holdings data in other tables are carried

   forward to the next FDATE set, creating cases of 'stale' data." Using the first available

   FDATE for a given FUNDNO-RDATE avoids the staleness. We also exclude funds with

   investment objective codes (IOC) of 1,5,6, and 7: International, Municipal Bonds,

   Bond & Preferred, and Balanced, respectively, just to keep the size of the file down. */

proc sort data = tfn.s12type1 (keep=fundno rdate fdate ioc assets

      where=(ioc not in (1,5,6,7))) out = holdings1 (drop=ioc);

      by fundno rdate fdate;

run;

 

/* there are several funds that report more than once in a given month; for those funds

   we keep only the last report of the month (e.g., fundno = 72600 in Nov 1997) */

data holdings1; set holdings1;

      by fundno rdate;

      if first.rdate;

      date = rdate;

      md = &md;

      drop date;

run;

 

/* 'md' is month dummy that corresponds to the month of RDATE. Note that FUNDNO-FDATE and

   FUNDNO-RDATE are unique identifiers (keys) */

data holdings1; set holdings1;

      by fundno md;

      if last.md;

run;

 

/* MFLINKS file was recently updated to allow more direct linking from Thomson Financial

   Note that fundno+fdate+rdate is a unique identifier (key) in both mfl.mflink2 and hodlings1

 

   If you only have access to the old (pre-June 2013) version of mfl.mflink2 file, then replace

   the following (uncommented) statement that creates table holdings2 with these ones:

      proc sql;

            create table mflink as select wficn, fundno, sdate1,

            case when sdate2 = max(sdate2) then mdy(12,31,2012) else sdate2 end as sdate2 format date9.

            from mfl.mflink2;

 

            create table holdings2 as select b.wficn, a.*

            from holdings1 as a, mflink as b

            where a.fundno = b.fundno and fdate between sdate1 and sdate2

            order by wficn, rdate, assets;

 

            drop table mflink;

      quit;

 

   Note that the previous version of mfl.mflink2 file stops in June of 2010. The above

   statements assume here that the links that are valid at that time remain valid through

   the end of 2012. This is likely accurate for the vast majority of the funds. This assumption

   can be improved upon (tediously) by using portfolio identifiers from CRSP (crsp_cl_grp),

   which are available starting in 1998. */

proc sql;

      create table holdings2 (where=(wficn ne .)) as select b.wficn, a.*

      from holdings1 as a, mfl.mflink2 as b

      where a.fundno = b.fundno and a.fdate = b.fdate and a.rdate = b.rdate

      order by wficn, rdate, assets;

quit;

 

/* there are several instances when wficn-rdate is not unique; in those cases, pick the

   fund with largest assets */

data holdings2; set holdings2;

      by wficn rdate;

      if last.rdate;

run;

 

/* add holdings from the s12type3 dataset: we don't need FUNDNO after this, so drop it */

proc sql;

      create table holdings3

      as select wficn, rdate, md, a.fdate, assets, cusip, shares

      from holdings2 as a, tfn.s12type3 as b

      where a.fdate = b.fdate and a.fundno = b.fundno;

quit;

 

/* link historical CUSIP to CRSP to get unique stock identifier PERMNO */

proc sql;

      create table holdings4

      as select wficn, rdate, md, fdate, assets, permno, shares

      from holdings3 as a, (select distinct ncusip, permno from crsp.msenames

    where not missing(ncusip)) as b

      where a.cusip = b.ncusip;

quit;

 

/* one last thing to do is to adjust shares held by the fund for stock splits, etc. The number

   of shares is reported as of FDATE but the effective holding date is RDATE. If a particular

   stock (PERMNO) had some event such as a stock split which affected the number of shares

   outstanding, the reported number of shares needs to be adjusted.

 

   As an example, let's continue with fundno = 1099 (ROYCE HERITAGE FUND) reference above. Its

   wficn is 105794, and its SEC's Edgar filings are here: http://goo.gl/t5mAi

   On RDATE=March 31 2010, the fund held permno = 76489 (LUFKIN INDUSTRIES), and the number

   of shares reported in s12type3 file is 43,400. However, if we dig up the actual SEC filing

   for that quarter (http://goo.gl/yTkHb), we'll see that the fund actually held only 21,700

   shares which amounted to an investment of $1,717,555. Why the difference? It's because

   Lufkin Industries had a 2:1 split in June 2010, and the number of shares in s12type3 is

   already adjusted for the split. So we need to re-adjust it back before we compute dollar

   investments for every fund-month-stock observation */

proc sql;

      create table holdings5

      as select a.*, b.cfacshr as cfacshr_fdate

      from holdings4 as a, crsp.msf (keep=permno date cfacshr) as b

      where a.permno = b.permno and year(fdate) = year(date) and month(fdate) = month(date);

 

      create table holdings6

      as select a.*, round(shares*cfacshr_fdate/b.cfacshr,1) as shares_adj

      from holdings5 as a, crsp.msf (keep=permno date cfacshr) as b

      where a.permno = b.permno and year(rdate) = year(date) and month(rdate) = month(date);

 

      create table holdings7 (where=(D > 0)) as select wficn, md, a.permno,

      case when shares_adj*abs(prc) > 0 then shares_adj*abs(prc) else shares*abs(prc) end as D

      from holdings6 as a, crsp.msf (keep=permno date prc) as b

      where a.permno = b.permno and year(rdate) = year(date) and month(rdate) = month(date);

quit;

 

/* we can confirm that holdings of ROYCE HERITAGE FUND (wficn=105794) in March 2010 (md=1012)

   of LUFKIN INDUSTRIES stock (PEMRNO=76489) is indeed the correct $1,717,555 */

data _null_; set holdings7;

      if wficn = 105794 and md = 1012 and permno = 76489;

      format D comma12.; put D;

run;

 

/* delete several duplicates to obtain the final holdings file we wanted */

proc sort data = holdings7 out = holdings nodupkey; by wficn md permno; run;

 

/* delete unneeded datasets */

proc datasets nowarn nolist nodetails;

      delete holdings1 - holdings7 mflink;

run; quit;

 

 

 

 

/********************************************************************************************

      PART 2: create a dataset that links CRSP_FUNDNO share class identifiers from CRSP to

      WFICN fund identifier from MFLINK and keep only equity funds. The dataset contains

      WFICN (fund identifier) and CRSP_FUNDNO (share class identifier). There are other ways

      of doing this, and you can certainly use permutations of this. Here, we make use of the

      recently introduced investment style variable from CRSP (crsp_obj_cd) and also use the

      longer approach that relies on other style identifiers. At the end of the day, the two

      approaches give virtually identical rankings, and we make use of both here

********************************************************************************************/

 

/* different investment objective codes cover different time period. These codes are assigned

   by various institutions and cover a range of time periods and funds: Strategic Insight

   objective codes are available between 1992 and 1999, Lipper classifications are present

   from the end of 1999 through 2008, Wiescat codes begin in early 2008, and Wiesenberger

   codes run through 1993. */

 

/* CRSP policies to exclude: note that it is available through 1990 only */

%let crsp_policy_to_exclude='C & I','Bal','Bonds','Pfd','B & P','GS','MM','TFM';

/* lipper classes to include */

%let lipper_class=('EIEI','G','LCCE','LCGE','LCVE','MCCE','MCGE','MCVE',

                           'MLCE','MLGE','MLVE','SCCE','SCGE','SCVE');

%let lipper_obj_cd=('CA','EI','G','GI','MC','MR','SG');

/* strategic insight classes to include */

%let si_obj_cd=('AGG','GMC','GRI','GRO','ING','SCG');

/* weisenberger classes to include */

%let wbrger_obj_cd=('G','GCI','IEQ','LTG','MCG','SCG');

 

/* retain latest style for every crsp_fundno */

data fund_style; set crsp.fund_style;

      format lipper_classX $4.lipper_obj_cdX $3. si_obj_cdX $3. wbrger_obj_cdX $5.;

      by crsp_fundno;

      %macro temp (var=);

            retain &var.X;

            if first.crsp_fundno then &var.X = &var;

            else if &var ne '' then &var.X = &var;

      %mend temp;

      %temp(var=lipper_class);

      %temp(var=lipper_obj_cd);

      %temp(var=si_obj_cd);

      %temp(var=wbrger_obj_cd);

run;

 

data latest_style (rename=(lipper_classX=lipper_class lipper_obj_cdX=lipper_obj_cd

      si_obj_cdX=si_obj_cd wbrger_obj_cdX=wbrger_obj_cd)); set fund_style;

      drop lipper_class lipper_obj_cd si_obj_cd wbrger_obj_cd;

      by crsp_fundno;

      if last.crsp_fundno;

      keep crsp_fundno lipper_classX lipper_obj_cdX si_obj_cdX wbrger_obj_cdX crsp_obj_cd;

run;

 

/* keep most recent name of every share class */

data names; set crsp.fund_names (where=(fund_name ne ''));

      by crsp_fundno;

      if last.crsp_fundno;

      keep crsp_fundno fund_name index_fund_flag;

run;

 

/* domestic equity mutual funds: this borrows from return_gap.sas sample program on

   WRDS, http://goo.gl/DB1iz (you have to be a WRDS member to access that link) */

data funds; set fund_style (keep=crsp_fundno si_obj_cd wbrger_obj_cd policy lipper_class

      lipper_obj_cd where=(policy not in (&crsp_policy_to_exclude)));

      long_way = 1;

      dummy = 0;

      if lipper_class in &lipper_class or lipper_obj_cd in (&lipper_obj_cd) then output;

      else if missing(lipper_class)=1 and missing(lipper_obj_cd)=1 and

            si_obj_cd in &si_obj_cd then output;

      else if missing(lipper_class)=1 and missing(lipper_obj_cd)=1 and missing(si_obj_cd)=1

            and wbrger_obj_cd in &wbrger_obj_cd then output;

      /* note that these are unnecessary: for all but seven funds, crsp_fundnos are

         identified as equity funds without these statements, and none of the seven

         funds that have all dummy = 1 (and no dummy = 0) cases is a domestic

         diversified equity funds, so actually that merging to get avrcs is not needed */

      /* else if missing(lipper_class)=1 and missing(lipper_obj_cd)=1 and missing(si_obj_cd)=1

            and missing(wbrger_obj_cd)=1 and policy='CS' then do;

            dummy = 1; output;end;

      else if missing(lipper_class)=1 and missing(lipper_obj_cd)=1 and missing(si_obj_cd)=1

            and missing(wbrger_obj_cd)=1 and missing(policy)=1 and 80<=avrcs<=105 then do;

            dummy = 1; output; end; */

      keep crsp_fundno long_way;

run;

 

proc sort data = funds nodupkey; by crsp_fundno; run;

 

/* shorter way of identifying objectives using crsp_obj_cd variable */

data funds2; set crsp.fund_style;

      if substr(crsp_obj_cd,1,1) = 'E'; /* equity */

      if substr(crsp_obj_cd,2,1) = 'D'; /* domestic */

      if substr(crsp_obj_cd,3,1) in ('C','Y'); /* cap-based or style */

      if substr(crsp_obj_cd,3,2) not in ('YH','YS'); /* exclude hedged, short */

      if si_obj_cd ne 'OPI'; /* exclude option income */

      short_way = 1;

      keep crsp_fundno short_way;

run;

 

proc sort data = funds2 nodupkey; by crsp_fundno; run;

 

/* there are only 85 funds that don't overlap between long way and short way: all

   of them are in the CRSP sample (the one obtained using crsp_obj_cd) but not in mine:

   all of them appear to have changed style to international or are otherwise international */

data funds3; merge funds funds2 latest_style names;

      by crsp_fundno;

      if long_way = 1 or short_way = 1;

      if long_way = short_way then delete;

run;

 

/* look for funds that have flip-flopped their style */

proc sql;

      create table funds4 as select a.*, b.crsp_obj_cd

      from funds as a left join fund_style (keep=crsp_fundno crsp_obj_cd begdt) as b

      on a.crsp_fundno = b.crsp_fundno

      order by a.crsp_fundno, begdt;

quit;

 

data funds4; set funds4;

      if substr(crsp_obj_cd,1,3) not in ('EDC','EDY') then flipper = 1;

      else flipper = 0;

run;

 

proc sql;

      create table funds5 as select crsp_fundno, max(flipper) as flipper from funds4

      group by crsp_fundno;

quit;

 

/* identify index and target date funds and drop them from the sample */

data funds6; merge funds5 (in=in1) names;

      by crsp_fundno;

      if in1;

run;

 

proc sql;

      create table funds7 (drop=index_fund_flag) as select * from funds6 group by crsp_fundno

      having count(index_fund_flag) = 0;

quit;

 

data funds8; set funds7;

      format namex $140.;

      namex = lowcase(fund_name);

      if max(index(namex,'index'), index(namex,'s&p')) = 0;

      if max(index(namex,'idx'), index(namex,'dfa'), index(namex,'program')) = 0;

      if max(indexw(namex,'etf'),index(namex,'exchange traded'),

            index(namex,'exchange-traded')) = 0;

      if max(index(namex,'target'),index(namex,'2005'),index(namex,'2005'),

            index(namex,'2010'),index(namex,'2015'),index(namex,'2020'),index(namex,'2025'),

            index(namex,'2030'),index(namex,'2035'),index(namex,'2040'),

            index(namex,'2045'),index(namex,'2050'),index(namex,'2055')) = 0;

      drop namex fund_name;

run;

 

/* group into funds using mflinks and exclude flippers: drops sample slightly. there are

   some unusual observations where some share classes change style while other share

   classes in the same fund do not (e.g., wficn=100001) */

proc sql;

      create table equity_funds (where=(wficn ne .)) as

      select b.wficn, a.crsp_fundno from funds8 as a left join mfl.mflink1 as b

      on a.crsp_fundno=b.crsp_fundno

      group by b.wficn having max(flipper) = 0 order by crsp_fundno;

quit;

 

/* delete unneeded datasets */

proc datasets nowarn nolist nodetails;

      delete funds funds2-funds8 fund_style latest_style names;

run; quit;

 

 

 

 

/********************************************************************************************

      PART 3: create a dataset that for every equity fund WFICN and month dummy MD contains size

      TNA, net return RET, and gross return RRET. Some code below borrows from return_gap.sas sample

      program on WRDS, http://goo.gl/DB1iz (you have to be a WRDS member to access that link)

********************************************************************************************/

data returns1; merge equity_funds (in=in1) crsp.monthly_tna_ret_nav (drop=mnav);

      by crsp_fundno;

      if in1 = 1;

      if wficn ne .;

      if mtna < 0 then mtna = .;

      mtna = mtna + 0;

      mret = mret + 0;

      /* retain last available tna */

      retain tna;

      if first.crsp_fundno or mtna ne . then tna = mtna;

      rename caldt=date;

      drop mtna;

run;

 

/* get expense ratio data required to get gross returns */

proc sql;

      create table returns2 as select a.*, b.exp_ratio

      from returns1 (rename=(tna=mtna)) as a left join crsp.fund_fees as b

      on a.crsp_fundno=b.crsp_fundno and date between b.begdt and b.enddt

      order by crsp_fundno, date;

quit;

 

/* compute gross returns */

data returns2; set returns2;

      by crsp_fundno date;

      if exp_ratio = -99 then exp_ratio=.;

      weight = lag(mtna);

      if first.crsp_fundno then weight = mtna;

      rret=sum(mret,exp_ratio/12);

      drop exp_ratio;

run;

 

/* aggregate multiple share classes */

proc sort data = returns2; by wficn date; run;

 

data multiclass1 oneclass; set returns2;

      by wficn date;

      if first.date=0 or last.date=0 then output multiclass1;

      else output oneclass;

run;

 

proc sql;

      create table multiclass2

      as select wficn, date, sum(mret*weight)/sum(weight) as mret,

      sum(mtna) as mtna, sum(rret*weight)/sum(weight) as rret

      from multiclass1 group by wficn, date;

quit;

 

data returns; set oneclass (drop=weight) multiclass2;

      md = &md;

      rename mret=ret mtna=tna;

      drop crsp_fundno date;

run;

 

proc sort data = returns nodupkey; by wficn md; run;

 

proc datasets nowarn nolist nodetails;

      delete multiclass1 multiclass2 oneclass returns1 returns2;

quit;

 

 

 

 

/********************************************************************************************

      PART 4: compute active weights, AW. The resulting dataset contains

      for each fund (WFICN) and month dummy (MD), a measure of AW

********************************************************************************************/

proc sort data = equity_funds (keep=wficn) out = funds0 nodupkey; by wficn; run;

 

proc sql;

      create table holdings1 as select a.*, abs(b.prc)*b.shrout as ME from holdings

      as a left join crsp.msf (keep=permno date prc shrout)

      as b on a.permno = b.permno and a.md = &md;

 

      create table holdings2 as select wficn, md, D, D/sum(D) as w, ME/sum(ME) as wME

      from holdings1 (where=(ME > 0 and D>0)) group by wficn, md;

 

      create table holdings3 as select wficn, md, sum(abs(w-wME)) as aw, N(w) as Nstocks,

      sum(D) as sumD    from holdings2 group by wficn, md;

quit;

 

/*    for each calendar quarter at some point during which a fund disclosed its stock holdings,

    md is redefined as a month dummy corresponding to the last calendar month of that quarter;

    require at least 15M invested in at least 10 stocks */

data holdings4; merge funds0 (in=in1) holdings3;

      by wficn;

      if in1 = 1;

      md = ceil((md-1)/3)*3 + 1;

      if Nstocks >= 10 and sumD >= 15;

run;

 

proc sort data = holdings4 out = aw (keep=wficn md aw); by md; run;

 

/* delete some datasets we don't need anymore */

proc datasets nowarn nolist nodetails;

      delete funds0 holdings1-holdings4 holdings equity_funds;

run; quit;

 

 

/* As an aside, if you are interested in whether AW is highly correlated with active share,

   it is not. In this setup with fewer data restrictions than in the paper, rho=0.26 based on the

   1990-2006 sample originally available from Cremers and Petajisto and rho=0.20 based on

   a recently updated sample that extends through 2009). Refer to the paper for details,

   but here is piece of code computing the correlations. You will need to download

   activeshare SAS dataset (http://www.petajisto.net/data.html). Place it in a folder that is

   referenced in the libname statement below:

 

      libname as 'e:\Research\';  *<--- Update with your path to activeshare dataset;

     

      data as1; set as.activeshare (keep=wficn rdate activeshare);

            date = rdate;

            md = ceil((&md-1)/3)*3 + 1;

            drop rdate date;

      run;

 

      proc sort data = aw out = as2; by wficn md; run;

 

      data as2; merge as1 (in=in1) as2 (in=in2);

            by wficn md;

            if in1 = 1 and in2 = 1;

      run;

 

      proc sort data = as2; by md; run;

 

      proc corr data = as2 noprint outp = as3 (where=(_type_='CORR') rename=(aw=corr));

            var aw;

            with activeshare;

            by md;

      run;

 

      proc means data = as3 mean;

            var corr;

      run;

 

      If you'd like, you can run the code in Part 5 using active share rather than aw and

      compare the results. Just run this statement that effectively renames active share

      as aw and then you can execute the code below in part 5 keeping in mind that you

      are actually doing the sorts using active share:

      data awx; set as2 (drop=aw);

            rename activeshare=aw;

      run;

*/

 

 

 

/********************************************************************************************

    PART 5: compute returns, betas, etc for AW-sorted portfolios. Call the following macro

    by providing the number of groups you would like to sort funds into and by specifying

    whether to use net return (in which case you should set ret=ret) or gross return

    (in which case you should set ret=rret). The results are printed into the .lst file

********************************************************************************************/

%macro get_portfolio_results(groups=,ret=);

proc rank data = aw out = aw1 (keep=wficn rank md) groups = &groups;

      var AW;

      by md;

      ranks rank;

run;

 

/* add fund size at the time of AW calculation and returns over the subsequent 12 months

   you can modify the length of holding period here if you want by changing 12 to, say, 3 */

proc sql;

      create table aw2 as select a.*, b.tna from aw1 as a left join returns as b

      on a.wficn = b.wficn and a.md = b.md;

 

      create table aw3 as select a.*, b.md as mdret, b.&ret as ret

      from aw2 as a left join returns as b on a.wficn = b.wficn and 1 <= b.md-a.md <= 12

      order by a.wficn, a.md, b.md;

quit;

 

data aw3; set aw3;

      by wficn md;

      retain w;

      lagret = lag(ret);

      if first.md then w = tna;

      else w = w * (1+lagret);

      drop lagret tna md;

run;

 

proc means data = aw3 nway noprint;

      var ret;

      class rank mdret;

      weight w;

      output out = aw4 (drop=_type_ _freq_) mean=ret;

run;

 

/* high - low portfolio (for later convenience, it's dubbed rank = 99) */

proc sql;

      create table aw5 as select a.mdret, 99 as rank, a.ret-b.ret as ret

      from aw4 (where=(rank=%eval(&groups-1))) as a left join aw4 (where=(rank=0)) as b

      on a.mdret = b.mdret order by a.mdret;

quit;

 

data aw5; set aw4 aw5; run;

 

/* the code below computes unconditional alphas only; for conditional alphas, it's

   straightforward to get conditioning variables and modify the code below

   If you want to modify the sample period, this is the place to do it. */

proc sql;

      create table aw6 as

      select a.*, date, ret-rf as retmrf, mktrf as rm, hml, smb, umd

      from aw5 as a left join ff.factors_monthly as b

      on a.mdret = &md order by rank, date;

quit;

 

data aw6; set aw6;

      if rank = 99 then do;

            retmrf = ret;

      end;

run;

 

proc reg data = aw6 noprint adjrsq tableout

      outest = params (keep=rank intercept rm hml smb umd _type_ _adjrsq_

      where=(_type_ in ('T' 'PARMS' 'PVALUE')));

      model retmrf = rm hml smb umd;

      by rank;

      output out=IVol1 (keep=rank r) r=r;

run;

data params; set params;

      if _type_ = 'PARMS' then do;

            intercept = intercept*1200;

            R2 = _adjrsq_ * 100;

      end;

run;

 

/* macro to make the results look presentable */

%macro format_regres(var=,name=);

      proc transpose data = params out = &name (drop=_name_);

            var &var;

            by rank;

            id _type_;

      run;

 

      data &name; set &name;

            format ast $8.;

            if pvalue <= 0.01 then ast='***';

            else if pvalue <= 0.05 then ast='**';

            else if pvalue <= 0.10 then ast='*';

            output;

            if rank = 99 then do;

                  rank = 100;

                  parms = T;

                  ast = '';

                  output;

            end;

            drop T Pvalue;

      run;

 

      data &name; set &name;

            parms = round(parms*100)/100;

            format parms 12.5;

            format &name dummy $15.;

            if floor(parms) = parms then dummy = compress(parms)||'.00000';

            else dummy = compress(parms)||'00000';

            var = compress(substr(dummy,1,10));

            dot = index(var,'.');

            if abs(parms) < 10 then &name = substr(var,1,dot+2);

            else &name = substr(var,1,dot+1);

            if parms >= 0 and rank ne 100 then &name = '~'||compress(&name);

            &name = compress(&name)||compress(ast);

            if rank = 100 then &name = '['||compress(&name)||']';

            if compress(&name) = '-0.00' then &name = '~0.00';

            &name = tranwrd(&name,'~',' ');

            keep rank &name;

      run;

%mend format_regres;

%format_regres(var=intercept,name=CarhartAlpha);

%format_regres(var=rm,name=Market);

%format_regres(var=hml,name=Value);

%format_regres(var=smb,name=Size);

%format_regres(var=umd,name=Momentum);

%format_regres(var=R2,name=R2);

 

proc means data = IVol1 nway noprint;

      var r;

      class rank;

      output out = IVol2 (drop=_type_ _freq_) std=IVol;

run;

 

data results; merge CarhartAlpha market value size momentum r2 IVol2;

      by rank;

      IVol = IVol * 100;

      r2 = tranwrd(r2,'***','');

      if rank = 100 then r2 = .;

      format IVol 4.2;

      format Portfolio $11.;

      Portfolio = compress(substr(rank+1,1));

      if rank = 0 then Portfolio = 'Low';

      if rank = %eval(&groups-1) then Portfolio = 'High';

      if rank = 99 then Portfolio = 'High - Low';

      if rank = 100 then Portfolio = 't-statistic';

run;

 

%if &ret=ret %then %do;

      title "Results Based on Net Returns of &groups AW-Sorted Portfolios";

%end;

%if &ret=rret %then %do;

      title "Results Based on Gross Returns of &groups AW-Sorted Portfolios";

%end;

 

proc print data = results noobs;

      var Portfolio CarhartAlpha Market Value Size Momentum R2 IVol;

run;

title;

 

proc datasets nowarn nolist nodetails;

      delete aw1-aw6 CarhartAlpha market value size momentum r2 ivol1 ivol2 params results;

run; quit;

%mend get_portfolio_results;

 

/* call the macros*/

%get_portfolio_results(groups=10,ret=ret);

%get_portfolio_results(groups=10,ret=rret);