TL;DR: I talk about some text frequency analysis I did on the arXiv corpus using Python, MySQL, and R to identify trends and spot interesting new physics results.


In one of my previous posts, I mentioned some optimization I had done on a word-frequency analysis tool. I thought I’d say a bit more here about the tool (buzzArxiv), which I’ve put together using Python and R to find articles that are creating a lot of ‘buzz’.

For those who don’t know, the arXiv is an online repository of scientific papers, categorized by field (experimental particle physics, astrophysics, condensed matter, etc…). Most of the pre-print articles posted to the arXiv eventually also get submitted to journals, but it’s usually on the arXiv that the word about new work gets disseminated in the community. The thing is, there’s just gobs of new material on there every day. The experimental, phenomenology, and theory particle physics mailings can each have a dozen or so articles per day.


Collaborative Development

TL;DR: I used gource to visualize the ATLAS trigger code development activity in SVN as a way to illustrate the size and the collaborative development environment of the experiment.


It’s hard to convey to a general audience the size and scope of particle physics experiments. Not just the sheer size of the detectors, but the small army of scientists who are all working together. The modern incarnation of particle physics is a highly collaborative effort. The ATLAS masthead, the list of all the authors who get credited on publications, is about three thousand names long, and the list of active members, who are affiliated in some way, shape, or form with the experiment, has over forty-seven hundred entries. This may be small on the scale of big companies (Microsoft has something like ninety thousand employees), but it is absolutely massive on the scale of academic research.

A while ago I came across gource, a tool for “software version control visualization” which renders project activity in many popular repository systems including SVN, which ATLAS uses. Its rendered output does a good job of representing what I so frequently have trouble conveying about our field. As an example, I pulled the log files for the SVN used by the trigger group (which is one of the areas I’m most involved in), cleaned them up a bit, and passed them to gource. The output below shows the activity in early 2002 when the SVN was started, in 2009 right before collisions were first slated to begin at the LHC, and in late 2012 in the middle of the latest and highest intensity running of the LHC. I scrubbed the video of most identifying information (user names, directory names, etc…), but I think the point still comes across.

Each branch in the video represents a directory, the colored dots are files, and the little icons zooming around are users making commits to the SVN repository. Every time I watch this video I’m struck by how much cross-pollination is going on.
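For anyone who wants to try something similar, here is a rough sketch of the kind of clean-up step described above. It is not the script I actually used: it assumes the log was exported with `svn log --xml --verbose`, it writes gource’s pipe-delimited custom log format (timestamp|user|action|path), and the function name and the “userN” aliases are made up for illustration. It only scrubs user names, not directory names.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def svn_xml_to_gource(xml_text):
    """Turn `svn log --xml --verbose` output into gource custom-log lines.

    Hypothetical helper: real user names are replaced by anonymous
    aliases (user1, user2, ...) in order of first appearance.
    """
    aliases = {}  # real user name -> anonymous alias
    lines = []
    for entry in ET.fromstring(xml_text).iter("logentry"):
        author = entry.findtext("author", default="unknown")
        alias = aliases.setdefault(author, f"user{len(aliases) + 1}")
        # SVN dates look like 2009-11-20T14:03:12.123456Z (UTC);
        # keep only the part down to whole seconds.
        stamp = entry.findtext("date")[:19]
        when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S")
        seconds = int(when.replace(tzinfo=timezone.utc).timestamp())
        for path in entry.iter("path"):
            if path.get("kind") == "dir":
                continue  # only files are drawn as dots
            action = path.get("action")
            if action == "R":  # treat "replaced" as a modification
                action = "M"
            if action in ("A", "M", "D"):
                lines.append(f"{seconds}|{alias}|{action}|{path.text}")
    # svn lists the newest revision first; sort oldest first for playback.
    return sorted(lines, key=lambda line: int(line.split("|", 1)[0]))
```

Writing the returned lines to a file, one per line, and running gource on that file with `--log-format custom` should then play the history back without exposing who made which commit.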

The Job Market

TL;DR: The academic job market is looking bleaker than ever, and name brand recognition counts for a lot.


  • Step 1: College Student, study hard.
  • Step 2: Graduate Student, study hard, do research.
  • Step 3: Postdoc, work hard, more research, prove your chops.
  • Step 4: Professor, kick back and relax, you’re set for life.

Ah, if only.

The postings have begun for this year’s set of junior and tenure track faculty positions. This affects me directly, since I’m somewhere around step 3.5. The truth of the matter is that the job market for faculty positions is amazingly competitive, with few offerings and a sea of prospective applicants. A while ago, stuck at home while both Laureline and Jenn were sick as dogs, I had a little fun with the information up on the HEP Rumor Mill. The HEP Rumor Mill is exactly what it sounds like: completely unverified rumors about who has been short-listed or made an offer for high energy particle physics faculty jobs. They’ve got one page per year, going back to the 2004-2005 job cycle. Getting at the data only requires writing a quick and dirty parser for unstructured data. I thought I’d share some of what I found. Note that all of what comes out of this is highly suspect. The short-lists and offers are unverified, the names of candidates are sometimes misspelled, and I even caught one or two instances of someone’s affiliation being improperly reported.


Higgs Public Lecture

TL;DR: I gave a public lecture about the Higgs to 500 people and felt like a rock star.


While back at SLAC for the Summer Institute, I had the opportunity to give a public lecture about the Higgs. The target audience was the general public, which in the Bay Area still seems to skew towards the technologically knowledgeable: high-school students with an interest in science, Silicon Valley types, retired engineers, etc…

SLAC has posted the video to YouTube here:

I prepare slides on what seems like a daily basis for work. Needless to say, this was a very different experience. The technical level was substantially different, and it gave me more of an opportunity to play with entertaining graphics and videos. Frankly, I think it all came out quite nicely, and the audience feedback was overwhelmingly positive.

There were over 500 people in the audience, apparently a new record. The organizers had to prepare spill-over space outside of the Panofsky auditorium (where I was presenting) and open a second auditorium. According to the security guards, they still had to turn away about 50 cars when they ran out of space. For ~60 minutes I got to feel like a rock star! 😉