Where science and tech meet creativity.

Today I had a juxtaposition of two sets of journal articles. On one hand, I had the photo-ready proofs of a journal article on the Astronomy Cast listener survey that I submitted to, and had accepted by, CAP. On the other hand, I had a bunch of journal articles my student was mining for data for our research (those articles are on the evolution of galaxies in clusters).

I have to admit, my own journal article (which I’ll post links to once it is online) would put me to sleep – it is a dry recitation of facts, figures, numbers, and a couple of charts. Everything is quantified and potential errors are noted. If you want to replicate my analysis, almost everything you need is there, and the only stuff missing is the exact words folks wrote in the fill-in-the-blank boxes (and federal research guidelines on privacy require I keep that private). My reason for writing in this dry and overly numerical way comes from my research history of mining other people’s work for the additional analysis that can come from pooling multiple people’s measurements. When a paper doesn’t state all its numbers in tables, but only gives a summary graph or a verbal description of a graph, it is very, very hard to repeat the analysis or re-purpose the data for further science. I recognize that I am overly sensitive to this, and acknowledge my papers probably contain an excessive number of tables, but still…

As my student worked hard to find information on the fraction of blue spiral galaxies in galaxy clusters, he kept running into two kinds of papers. Some were very friendly and carefully tabulated cluster distance (redshift), magnitude, size (N_30 and R_30), and blue fraction. (Whoot for good result reporting to several teams.) Many others, and I’m going to be nice and not name names, discussed in words how they had seen the blue fraction change across multiple clusters at multiple redshifts… but no numbers were stated. GRRRR. It is all very nice for me to say that I see a trend in the growth of my lawn as a function of time, and to state that the growth seems to be enhanced by rain, but unless I document the change in height with time, and the enhancement of that change with rainfall, using at least a plot, well, for lack of a cleaner way to state it, I’m talking fluff. Numbers and mathematics are the language of science, and they are the only way we have to compare concrete factors.

So my student gets the wonderful task of writing emails to journal article authors and requesting raw numbers. Fun, fun, fun.

Now, I have to admit I understand how this happens. A lot of scientists start from the premise that the most important part of their paper is the discussion of the results from the data analysis (I agree), and they recognize that people will start by reading their abstract, the intro, and the discussion (and that people *might* read how the data was acquired). So… since tables are a bear to create, stuff that can be described in words sometimes gets described in words alone, because no one is going to read the table anyway. And generally, they are right. And all these papers get passed through peer review, so… it must be all right. Right?

Well, no. Peer review is a good system, but it isn’t perfect. We are all supposed to read the papers of our peers and give them constructive comments to help make sure that what is allowed to be published in our professional journals is valid science, presented clearly, with all the needed information – all the required citations, all the required equations, and all the required tables.

But sometimes referees are just being human and miss things. Professionally, we’re always too busy and too tired. We make mistakes as we fight through 80+ hour work weeks. I personally managed (not on purpose) to mis-grade a question regarding pulleys on all of my exams. (I asked how many pulleys they needed and then expected them to write the number of ropes that were needed.) I’d like to think that when I am asked to referee things, I do it well, but I can see how easy it would be for me or anyone else to read a paper and not notice the one or two paragraphs (out of several pages), or even the one sub-section (of the dozen or so), that describes some secondary effect very well in words but fails to document the effect in numbers.

We are all overworked and we are all tired. We love what we do, so we push ourselves to do everything so that we can keep doing it. And we are tired, so sometimes we are lazy, and sometimes we write papers that, well, cause my undergrad to send begging emails.

But those are excuses. Somehow, as a community, we need to figure out how to better balance research, writing grant proposals, teaching classes, serving on committees, and all the other things we do, through better task distribution and time allocation (e.g. work us less and don’t cause us to have 30 twenty-minute breaks a week between classes, meetings, and everything else during which we’re expected to accomplish research). This would really make all of us better professors and scientists. It would improve our teaching – we could be better prepared and less rushed in our creation of notes, tests, and HW. It would improve our research – we’d see things quicker, and maybe even be less lazy in our writing – and it would make it easier to start on grant proposals earlier. And it might mean less sleeping in committee meetings too 🙂

I am grateful for the journal articles written by the people with the time, the energy, and the foresight to include way too many data tables. I wish happy coffee and table-typing experiences to everyone else as they work to write their next paper.

Include your data or my undergrad just might email you…

Some of us do look for the data in the papers.