欧美剧情在线一级片,日本高清不卡一二三区,宅男噜噜噜66国产免费

　　現(xiàn)在大家都知道我強烈推薦在DID中進行“平行趨勢敏感性分析”。這是因為Roth（2022，AER：Insights）指出，傳統(tǒng)的處理前趨勢檢驗效力在50%-80%之間。這個效力是什么？這就是今天想給大家介紹的，也是我的第三篇方法論模擬分析文章《因果推斷的效力分析的zui新趨勢與實踐指導(dǎo)》（前兩篇分別是《聽Rubin和Imbens的話：處理配置機制討論的實踐指導(dǎo)》和《如何控制協(xié)變量？DID中協(xié)變量調(diào)整的實踐指導(dǎo)》）。

　　除了Roth的文章外，zui近幾年，效力分析也在Top期刊的應(yīng)用研究中越來越常見，例如，Black et al.（2022，JPubE）、Dench et al.（2024，JPubE）。關(guān)于效力分析更詳細的技術(shù)講解，請參見Sylvain Chabé-Ferret（2024）的計量課本《Statistical Tools for Causal Inference》第七章。

　　下面的內(nèi)容主要來自于Nick Huntington-Klein（2020）：《Simulation for Power Analysis》。

　　我們有了idea后，就要去做許多統(tǒng)計分析。在學(xué)習(xí)計量之初，老師就告訴我們，要盡可能的增加樣本量。效力分析（Power Analysis）就是一種我們至少需要多少樣本量得到給定處理效應(yīng)，或者給定樣本量可以得到zui低處理效應(yīng)是多少。

　　無論你進行哪種研究，效力分析都是一個好主意。然而，它在兩種情況下特別有用：

如果你正在尋找一個你認為可能不是那么核心或重要的影響，即它是一個小的影響，或是一個系統(tǒng)的一部分，在這個系統(tǒng)中，有很多其他的事情正在發(fā)生（粗略地，你可以把它想象成“一個小的 2 ”），進行效力分析可能是一個好主意——了解小效應(yīng)所需的樣本量通常比你預(yù)期的要大得多，zui好現(xiàn)在就了解這一點，而不是在你已經(jīng)完成所有工作之后再做
如果你正在進行隨機實驗，你實際上可以控制樣本量 - 你可以選擇收集多少數(shù)據(jù)以及如何隨機化處理。在進行實驗之前，效力分析至關(guān)重要，這樣你就不會在實驗結(jié)束時才意識到“哦，該死，我應(yīng)該再多幾百人做一次......現(xiàn)在太晚了！”

　　效力分析的作用

　　讓X表示處理，Y表示結(jié)果，假設(shè)我們正在研究X和Y關(guān)系，效力分析設(shè)計以下五件事：

① 處理效應(yīng)的大小
② 處理的變動
③ 其它因素引起Y的變動
④ 統(tǒng)計精度
⑤ 樣本量

　　效力分析就是保持其它四項不變，第五項是多少。例如，如果我們認為效應(yīng)可能是 A，并且X的變動為B，并且Y中有 C 的變動與X無關(guān)，并且你希望至少有 D% 的機會發(fā)現(xiàn)一個效應(yīng)（如果確實存在），那么你需要至少 E 的樣本量。這告訴我們獲得足夠統(tǒng)計效力所需的zui小樣本量是E。

　　或者如果你有一個樣本量為 A，并且X的變動為 B ，并且Y有 C 的變動與X無關(guān)，并且你希望至少有 D% 的機會發(fā)現(xiàn)一個效應(yīng)（如果它確實存在），那么這個效應(yīng)必須至少與 E 一樣大。這告訴我們可檢測到的效應(yīng)的下界，即在給定樣本大小的情況下，你希望有機會合理測量的zui小效應(yīng)。

　　那么，“統(tǒng)計精度”是什么呢？通常，你有一個目標(biāo)統(tǒng)計效力水平（因此稱為“效力分析”）。統(tǒng)計效力是真實率。也就是說，如果確實存在影響，并且抽樣變動意味著你有 80% 的機會拒絕給定樣本中無效的零假設(shè)，那么你就有 80% 的統(tǒng)計效力。統(tǒng)計效力還取決于你正在運行的測試類型 - 如果你在 95% 的置信水平下進行假設(shè)檢驗，則與在 99% 的置信水平下進行假設(shè)檢驗相比，你更有可能拒絕零假設(shè)（因此統(tǒng)計效力更高）。

　　效力分析并不一定要考慮統(tǒng)計效力。我們可以從以上五項的任何一個角度來進行效力分析，例如標(biāo)準(zhǔn)誤。

　　這五項數(shù)值的來源

　　為了進行效力分析，我們需要知道五個數(shù)值中的四個，這樣效力分析才能告訴我們第五個。

　　另一個問題是效力標(biāo)準(zhǔn)是多大？統(tǒng)計效力越高，我們錯過真實處理效應(yīng)的可能性越小，但這也意味著我們需要更大的樣本。一般的經(jīng)驗來看，80%的效力是一個標(biāo)準(zhǔn)，但現(xiàn)在越來越多的研究者使用90%的效力。

　　效力分析的模擬

　　第yi步：構(gòu)造模擬數(shù)據(jù)集

　　我們模擬1000個樣本，處理變量X是0,1上的均勻分布，處理效應(yīng)是0.2，標(biāo)準(zhǔn)誤是0-3的正態(tài)分布。stata代碼如下：

* Set the number of observations we want in our simulated dataclear* Create our data* Note 0 and 1 are the default start and end of runiform; I want that, so I don't put anything inside runiform()g X = runiform()* Now create Y based on X* Don't forget to add additional noise!* The *true effect* of X on Y is .2g Y = .2*X + rnormal(0, 3)

　　第二步：執(zhí)行分析

　　假設(shè)我們計劃獲得回歸中的穩(wěn)健標(biāo)準(zhǔn)誤：

* reg is short for regressreg Y X, robust* And if we're just interested in significance, say at the 95% level,* we can compare the p-value to .05 and store the result as a local variable  (i.e. just store the single number, not as a regular Stata variable)local sig = 2*ttail(e(df_r),abs(_b[X]/_se[X])) <= .05* di is short for displaydi `sig'

　　第三步：重復(fù)上述過程

forvalues i = 1(1)500 {  * Set the number of observations we want in our simulated data  clear  set obs 1000    * Create our data  g X = runiform()  g Y = .2*X + rnormal(0, 3)    * Run analysis quietly  qui reg Y X, robust    * Pull out results  local sig = 2*ttail(e(df_r),abs(_b[X]/_se[X])) <= .05  di `sig'  local coef = _b[X]  di `coef'}

　　這個重復(fù)過程將會產(chǎn)生500次數(shù)據(jù)，然后他將獲取X的系數(shù)以及95%置信水平上是否顯著。

　　第四步：儲存結(jié)果

　　我們首先創(chuàng)建一個數(shù)據(jù)集來存儲結(jié)果。然后，隨著我們反復(fù)進行分析，我們將該preserve結(jié)果數(shù)據(jù)集清除，進行數(shù)據(jù)生成，并將結(jié)果存儲在本地變量中。然后我們restore返回結(jié)果數(shù)據(jù)集并將這些本地變量放入實際變量中。zui后，我們將擁有一個充滿結(jié)果的數(shù)據(jù)集！

clear* We want a number of observations equal to the number of times we will simulate the dataset obs 500* Blank variablesg coef_results = .g sig_results = .* Let's wrap the whole thing in quietly{} - we don't need the output!quietly {  forvalues i = 1(1)500 {        * Preserve our results data set so we can instantly come back to it    preserve        * Set the number of observations we want in our simulated data    * Now we're no longer in the results data set; we're in the simulated-data data set    clear    set obs 1000        * Create our data    g X = runiform()    g Y = .2*X + rnormal(0, 3)        * Run analysis quietly    qui reg Y X, robust        * Pull out results    local sig = 2*ttail(e(df_r),abs(_b[X]/_se[X])) <= .05    local coef = _b[X]        * Now restore to come back to results    restore        * And store our results in the results variables, using "in" to tell it which row to store the data in    replace coef_results = `coef' in `i'    replace sig_results = `sig' in `i'  }}* summ is short for summarizesumm sig_results

　　結(jié)果如下：

　　結(jié)果顯示，6%的統(tǒng)計效力（均值）。

　　第五步：更多的結(jié)果

　　我們還可以考察系數(shù)的分布：

tw kdensity coef_results, xti("Coefficient on X") yti("Density") lc(red)

　　X系數(shù)的密度分布

label def sig 0 "Not Significant" 1 "Significant"label values sig_results siggraph bar, over(sig_results) yti("Percentage")

　　第七步：高級效力分析

　　效力分析還可以獲得更多的信息：（1）我們想要分析不同效應(yīng)下的效力是多少。首先固定1000個樣本，嘗試分析不同效應(yīng)的大小的效力水平：

foreach effect in .4 .8 1.2 1.6 2 {  foreach sample_size in 1000 {    quietly {      clear      * We want a number of observations equal to the number of times we will simulate the data      set obs 500            * Blank variables      g sig_results = .            * Let's wrap the whole thing in quietly{} - we don't need the output!      forvalues i = 1(1)500 {                preserve                clear        * NOTICE I'm putting the "sample_size" variable from above here        set obs `sample_size'                * Create our data        g X = runiform()        * and the "effect" variable here        g Y = `effect'*X + rnormal(0, 3)                * Run analysis quietly        reg Y X, robust                * Pull out results        local sig = 2*ttail(e(df_r),abs(_b[X]/_se[X])) <= .05        local coef = _b[X]                * Now restore to come back to results        restore                * And store our results in the results variables, using "in" to tell it which row to store the data in        replace sig_results = `sig' in `i'      }            summ sig_results      * since we're inside a quietly{} we need a noisily to see this      noisily di "With a sample size of `sample_size' and an effect of `effect', the mean of sig_results is " r(mean)    }  }}

　　因此，看起來我們需要一個介于 0.8 和 1.2 之間的效應(yīng)，才能有 80% 的機會找到顯著的結(jié)果。如果我們認為效應(yīng)實際上不太可能那么大，那么我們需要想其他辦法---找到更大的樣本，使用更精確的估計方法，等等！否則我們可能應(yīng)該放棄。