Collaborative Development

TL;DR: I used gource to visualize the development activity of the ATLAS trigger code in SVN, as a way to illustrate the scale and collaborative nature of the experiment.


It’s hard to convey to a general audience the size and scope of particle physics experiments. Not just the sheer size of the detectors, but the small army of scientists who are all working together. The modern incarnation of particle physics is a highly collaborative effort. The ATLAS masthead, the list of all the authors who get credited on publications, is about three thousand names long, and the list of active members, who are affiliated in some way, shape, or form with the experiment, has over forty-seven hundred entries. This may be small on the scale of big companies (Microsoft has something like ninety thousand employees), but it is absolutely massive on the scale of academic research.

A while ago I came across gource (http://code.google.com/p/gource/), a tool for “software version control visualization” which renders project activity in many popular repository systems, including SVN, which ATLAS uses. Its rendered output does a good job of representing what I so frequently have trouble conveying about our field. As an example, I pulled the log files for the SVN used by the trigger group (one of the areas I’m most involved in), cleaned them up a bit, and passed them to gource. The output below shows the activity in early 2002 when the SVN was started, in 2009 right before collisions were first slated to begin at the LHC, and in late 2012 in the middle of the latest and highest intensity running of the LHC. I scrubbed the video of most identifying information (user names, directory names, etc.), but I think the point still comes across.
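For anyone curious about the scrubbing step: gource can work from a pipe-delimited “custom” log format (`timestamp|user|A/M/D|/path`), which makes anonymizing straightforward. Below is a minimal sketch of that kind of pass, not my exact script; the sample entries and user names are made up for illustration.

```python
# A minimal sketch of a scrubbing pass over a gource custom-format log, which
# uses pipe-delimited lines of the form: timestamp|user|A/M/D|/path/to/file.
# Not my exact script; the sample entries below are made up.

user_ids = {}

def scrub(line):
    """Replace the committer name with a stable anonymous id; keep the rest."""
    fields = line.rstrip("\n").split("|")
    if len(fields) != 4:
        return line  # pass anything malformed through untouched
    timestamp, user, action, path = fields
    # Hand out sequential ids so the same user keeps the same id throughout.
    anon = user_ids.setdefault(user, "user%03d" % (len(user_ids) + 1))
    return "|".join([timestamp, anon, action, path])

sample = [
    "1041379200|alice|A|/Trigger/TrigSteering/Steering.cxx",
    "1041379260|bob|M|/Trigger/TrigSteering/Steering.cxx",
    "1041379320|alice|M|/Trigger/TrigConfig/Menu.xml",
]
scrubbed = [scrub(line) for line in sample]
```

The same idea extends to stripping directory names, and gource will happily replay the result as a custom-format log.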

http://moderncentral.com/media/gource_nodirs_notags_nonames.webm


Each branch in the video represents a directory, the colored dots are files, and the little icons zooming around are users making commits to the SVN repository. Every time I watch this video I’m struck by how much cross-pollination is going on.

It’s All About the Data Structures, Stupid

TL;DR: There were some rookie mistakes lurking in some Python code, which I hunted down with profiling (using cProfile), reducing execution time by an order of magnitude.


This post is a little story about some profiling I did on Python code, which ultimately revealed that a poor choice of data structures had a significant impact on performance. The offending code is a script I use to look at trends in physics topics posted to arxiv.org (more on the output of the tool in another post). After pulling the data from arxiv, the script works out word frequencies and identifies increased occurrences. The parsing of the corpus was taking an excessively long time: my laptop fan would start whirring, and the system would slow to a crawl. Clearly, something very non-optimal was happening.
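The flavor of this class of mistake is easy to demonstrate. A classic example (not necessarily the exact bug in my script) is testing membership against a list, which scans the whole list on every lookup, instead of against a set, which is constant time on average:

```python
import timeit

# A classic rookie mistake: membership tests against a list scan the whole
# list, while the same tests against a set hash straight to the answer.
# The sizes here are arbitrary, just big enough to make the gap obvious.
vocab_list = [str(i) for i in range(10000)]
vocab_set = set(vocab_list)

words = [str(i) for i in range(0, 20000, 3)]  # stand-in for a parsed corpus

slow = timeit.timeit(lambda: [w for w in words if w in vocab_list], number=1)
fast = timeit.timeit(lambda: [w for w in words if w in vocab_set], number=1)
```

Swapping one data structure for another like this is exactly the kind of one-line change that can buy back an order of magnitude.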

My first thought was that the issue was in the part of the code that handles the histogramming of the occurrences, but nothing obvious jumped out at me when I reviewed the code. Enter the Python profilers. These things are great: they’re dead simple to run, and the results are relatively easy to parse. Of the three profilers the Python documentation recommends, cProfile is my favorite.
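For the curious, here is roughly what running it looks like; the `count_words` function below is just a toy stand-in for the real corpus-parsing code:

```python
import cProfile
import io
import pstats

def count_words(texts, vocabulary):
    """Toy stand-in for the real corpus-parsing code."""
    counts = {}
    for text in texts:
        for word in text.split():
            if word in vocabulary:  # the cost here depends on vocabulary's type
                counts[word] = counts.get(word, 0) + 1
    return counts

texts = ["higgs boson trigger jet muon tau"] * 2000
vocabulary = ["higgs", "boson", "jet"] * 500  # a list, so every lookup is a scan

profiler = cProfile.Profile()
profiler.enable()
count_words(texts, vocabulary)
profiler.disable()

# Dump the five most expensive entries, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Alternatively, `python -m cProfile -s cumulative script.py` profiles a whole script with no code changes at all.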

The Job Market

TL;DR: The academic job market is looking bleaker than ever, and name brand recognition counts for a lot.


  • Step 1: College Student, study hard.
  • Step 2: Graduate Student, study hard, do research.
  • Step 3: Postdoc, work hard, more research, prove your chops.
  • Step 4: Professor, kick back and relax, you’re set for life.

Ah, if only.

The postings have begun for this year’s set of junior and tenure-track faculty positions. This affects me directly, since I’m somewhere around step 3.5. The truth of the matter is that the job market for faculty positions is amazingly competitive, with few offerings and a sea of prospective applicants. A while ago, stuck at home while both Laureline and Jenn were sick as dogs, I had a little fun with the information up on the HEP Rumor Mill. The HEP Rumor Mill is exactly what it sounds like: completely unverified rumors about who has been short-listed for, or received an offer for, high energy particle physics faculty jobs. They’ve got one page per year, going back to the 2004-2005 job cycle. All it takes to get the data is writing a quick and dirty unstructured data parser. I thought I’d share some of what I found. Note that all of what comes out of this is highly suspect. The short lists and offers are unverified, the names of candidates are sometimes misspelled, and I even caught one or two instances of someone’s affiliation being improperly reported.
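To give a sense of the “quick and dirty” involved: the sketch below parses a hypothetical, simplified version of a rumor-mill entry. The real pages are messier HTML, and the line format here, names included, is made up purely for illustration.

```python
import re

# Hypothetical, simplified line format (the real pages are messier HTML):
#   "Shelbyville U (junior): short list: A. Physicist, B. Theorist; offer: A. Physicist"
LINE_RE = re.compile(
    r"^(?P<school>[^(]+)\((?P<rank>[^)]+)\):\s*"
    r"short list:\s*(?P<shortlist>[^;]*)"
    r"(?:;\s*offer:\s*(?P<offer>.*))?$"
)

def parse_line(line):
    """Turn one rumor-mill line into a dict, or None if it doesn't match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "school": m.group("school").strip(),
        "rank": m.group("rank").strip(),
        "short_list": [n.strip() for n in m.group("shortlist").split(",") if n.strip()],
        "offer": (m.group("offer") or "").strip() or None,
    }
```

Run over a whole year’s page, a parser like this yields a table of (school, short list, offer) rows that is easy to aggregate across job cycles.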



Higgs Public Lecture

TL;DR: I gave a public lecture about the Higgs to 500 people and felt like a rock star.


While back at SLAC for the Summer Institute, I had the opportunity to give a public lecture about the Higgs. The target audience was the general public, which in the Bay Area still seems to skew towards the technologically knowledgeable: high-school students with an interest in science, Silicon Valley types, retired engineers, etc.

SLAC has posted the video to YouTube here: http://www.youtube.com/watch?v=QuZJFzFKLwo

I prepare slides on what seems like a daily basis for work. Needless to say, this was a very different experience. The technical level was substantially different, and it gave me more of an opportunity to play with entertaining graphics and videos. Frankly, I think it all came out quite nicely, and the audience feedback was overwhelmingly positive.

There were over 500 people in the audience, apparently a new record. The organizers had to prepare spill-over space outside of the Panofsky auditorium (where I was presenting) and opened a second auditorium. According to the security guards, they still had to turn away about 50 cars when they ran out of space. For ~60 minutes I got to feel like a rock-star! 😉