

Showing posts from March, 2018

Reading list: some recent Ed evaluation and Stata articles/links of interest

Image from @AcademiaObscura. #EdEval #EdResearch #Stata A quick roundup of some recent links to education evaluation/research and/or Stata-related things I've been reading (or have bookmarked to read soon).

>> From the latest DeptEd NCVS 'data point' report: "Students who reported that repetition and power imbalance were components of the bullying they experienced were also more likely to agree that bullying had an impact on various aspects of their lives."

>> Matt Welch (AIR) says "Need to align #edeval criteria to standards for student learning #teachereval"

>> de Chaisemartin and D'Haultfœuille have a new paper on relaxing the treatment-effect homogeneity assumption in difference-in-differences analyses (dubbed fuzzy DID). Ungated version here:

Why ( •_•)>⌐■-■ / (⌐■_■) when you can 😐➡️😎 ? Using Emoji (and other unicode chars) in Stata

The other day I was using Stata on my Mac to import the email and text messages stored in .db files so that I could do some analysis on how often I text/email certain people. I've never been able to get odbc to work in Stata on my Mac, so I cheated and exported each table with sqlite3 via Stata's shell escape:

foreach tbl in chat chat_msg_join message handle chat_handle_join {
    ! sqlite3 -header -csv ~/Library/messages/chat.db "SELECT * FROM `tbl';" > /users/ebooth/`tbl'.csv
}

to get my texts. Merging tables from chat.db is another (very long) story. I opened these csv files in Excel to inspect them and noticed that the Emoji from the texts were missing (I assumed they'd be converted into some sort of unicode or hex equivalent), but when I imported these messages into Stata I was surprised to see something like the picture to the right, where Emoji were being included in Stata
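If the shell-escape route is finicky, a rough Python sketch of the same export works too: dump each table in a SQLite database to its own CSV file with a header row. The table names mirror the ones from chat.db above; the database path and output directory here are placeholders you'd swap for your own.

```python
import csv
import sqlite3


def export_tables(db_path, tables, out_dir):
    """Write each table in `tables` to out_dir/<table>.csv, header row first."""
    con = sqlite3.connect(db_path)
    try:
        for tbl in tables:
            cur = con.execute(f"SELECT * FROM {tbl}")
            header = [col[0] for col in cur.description]
            with open(f"{out_dir}/{tbl}.csv", "w", newline="", encoding="utf-8") as f:
                writer = csv.writer(f)
                writer.writerow(header)
                writer.writerows(cur)
    finally:
        con.close()
```

On a real Mac you would point `db_path` at ~/Library/messages/chat.db and pass the five table names from the loop above. Writing with `encoding="utf-8"` also keeps any Emoji in the message text intact, which matters for the rest of this post.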

Happy π day! Estimating Pi by graphing random numbers with Stata.

√-1 2^3 Σ π and it was delicious (because it was of the pizza variety). My obligatory March 14 Pi post involves estimating Pi by finding the proportion of randomly plotted points inside a square that also fall inside a circumscribed circle. The more points we plot using this method, the closer to Pi we get (in other words, Pi estimation by simulation in Stata (I know, I know, it should have been programmed in Python)). You can approximate Pi via boring methods like pressing the π key on your TI-83 calculator, calculating fractions like 22/7 or 355/113, calculating log(6)^log(5)^log(4)^log(3)^log(2), or whatnot, e.g.:

. di 22/7
3.1428571

. di 355/113
3.1415929

. di log(6)^(log(5)^(log(4)^(log(3)^log(2))))
3.1415774

or just cheat and plot circles via -tw- functions or -graph pie-, etc. in Stata. However (assuming you aren't Yasumasa Kanada and Daisuke Takahashi from the U of Tokyo with server cycles to spare) anothe
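The simulation idea in the excerpt can be sketched in a few lines (in Python here, since the post itself concedes it "should have been programmed in Python"): draw points uniformly in the unit square, count the share landing inside the quarter circle of radius 1, and multiply by 4, since that share approaches π/4 as the number of draws grows.

```python
import random


def estimate_pi(n, seed=42):
    """Monte Carlo estimate of pi from n uniform draws in the unit square."""
    rng = random.Random(seed)
    # A point (x, y) is inside the quarter circle when x^2 + y^2 <= 1.
    inside = sum(
        1 for _ in range(n)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * inside / n
```

With a couple hundred thousand draws the estimate typically lands within a couple hundredths of π; convergence is slow (the error shrinks like 1/√n), which is exactly why the post plots more and more points.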

Implications of NCAA march madness brackets with multiplier scoring (a Stata example)

This year at our office we are again helping contribute to the great March Madness economic productivity drain. However, we are also toying with the idea of switching to a Fibonacci sequence of bracket scoring and using multiplier scoring to weight up riskier selections (particularly in later rounds). In prior years, we did this whole thing on paper/manually, so to keep things simple we followed a standard scoring (pts per round) regime, the standard espn/yahoo format (e.g., 1-2-4-8-16-32 points across the 6 rounds). This year most people in our office seem supportive of a scoring scheme that incentivizes risk-taking. So, there are essentially two changes to our scoring scheme. First, we are changing to a 2-3-5-8-13-21 progression. With the more traditional 1-2-4-8-16-32 system, the championship game is worth 32x as much as any first-round game (which in effect makes the first-round games almost useless). With Fibonacci scoring, the last
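The difference between the two schemes is easy to check with a little arithmetic (sketched in Python here rather than the post's Stata): compare how much the championship game is worth relative to a single first-round game under each progression.

```python
# Points per round under each scheme, rounds 1 through 6.
standard = [1, 2, 4, 8, 16, 32]    # the usual espn/yahoo doubling format
fibonacci = [2, 3, 5, 8, 13, 21]   # the Fibonacci progression from the post

# Ratio of the championship game to a first-round game.
ratio_standard = standard[-1] / standard[0]    # 32.0
ratio_fibonacci = fibonacci[-1] / fibonacci[0]  # 10.5

print(ratio_standard, ratio_fibonacci)
```

Under the doubling scheme the title game is worth 32 first-round games, so early upsets barely move the standings; under the Fibonacci scheme that ratio drops to 10.5, which keeps first-round picks relevant to the final tally.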