Gateway to Think Tanks
Source type | Working Paper |
Normalized type | Report |
DOI | 10.3386/w26517 |
Source ID | Working Paper 26517 |
Text Selection | |
Bryan T. Kelly; Asaf Manela; Alan Moreira | |
Publication date | 2019-12-02 |
Publication year | 2019 |
Language | English |
Abstract | Text data is ultra-high dimensional, which makes machine learning techniques indispensable for textual analysis. Text is often selected: journalists, speechwriters, and others craft messages to target their audiences’ limited attention. We develop an economically motivated high dimensional selection model that improves learning from text (and from sparse counts data more generally). Our model is especially useful when the choice to include a phrase is more interesting than the choice of how frequently to repeat it. It allows for parallel estimation, making it computationally scalable. A first application revisits the partisanship of US congressional speech. We find that earlier spikes in partisanship manifested in increased repetition of different phrases, whereas the upward trend starting in the 1990s is due to entirely distinct phrase selection. Additional applications show how our model can backcast, nowcast, and forecast macroeconomic indicators using newspaper text, and that it substantially improves out-of-sample fit relative to alternative approaches. |
Subjects | Econometrics ; Estimation Methods ; Macroeconomics ; Macroeconomic Models ; Financial Economics ; Portfolio Selection and Asset Pricing ; Financial Markets |
URL | https://www.nber.org/papers/w26517 |
Source think tank | National Bureau of Economic Research (United States) |
Citation statistics | |
Resource type | Think tank publication |
Item identifier | http://119.78.100.153/handle/2XGU8XDN/584189 |
Recommended citation (GB/T 7714) | Bryan T. Kelly, Asaf Manela, Alan Moreira. Text Selection. 2019. |
Files in this item | ||||||
File name/size | Resource type | Version type | Access type | License | ||
w26517.pdf (784KB) | Think tank publication | | Restricted access | CC BY-NC-SA | |