Adventures of a multidimensional freak

This is Juan Julián Merelo Guervós's English-language blog. He teaches computer science at the University of Granada, in southern Spain. Come back here to read about politics and technology, with a new twist.

    Creative Commons License
    This work is licensed under a Creative Commons License.

    19th century scientific publishing

    Even as we move into the 21st century, scientific publishing hasn't changed much. You still have to write a paper in two dimensions, black on white, submit it to a print journal (or a conference), and then wait for ages before it's published (or rejected).
    There's been some change in the journals: most now accept papers electronically and also publish them on the web afterwards, usually on a pay-per-view basis. I'm not going to get into the ethics of this; it's just the way it is now. The problem is that the only value the web version adds is hyperlinks to the referenced papers (if they are on the same site).
    But things could change quite a lot more. For starters, scientific results are inherently hyperlinked: there should be some universal way of referring to a paper, so that a hyperlink to it could be inserted automatically in the final version of any paper that cites it. Then, the web version should be the default version, not a kind of afterthought. Web-only journals, nowadays, aren't very well regarded, which makes no sense, since a print version can be obtained straight away. Just adding another dimension by hyperlinking would make researching a subject much more straightforward.
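As a rough sketch of the automatic hyperlinking idea, assuming every paper had some universal identifier: the resolver address, the `[ref1]`-style citation markers, the identifier `id-0001` and the function name `link_citations` are all made up for illustration.

```python
import re

# Hypothetical resolver address; no such service is named in this post.
RESOLVER = "https://resolver.example/paper/"


def link_citations(text, references):
    """Turn [key] citation markers into HTML links.

    `references` maps a citation key to the paper's (hypothetical)
    universal identifier. Keys without an identifier are left as they are.
    """
    def to_link(match):
        key = match.group(1)
        identifier = references.get(key)
        if identifier is None:
            return match.group(0)  # unknown citation: keep the plain marker
        return f'<a href="{RESOLVER}{identifier}">[{key}]</a>'

    return re.sub(r"\[(\w+)\]", to_link, text)


# Placeholder reference list: only ref1 has an identifier.
references = {"ref1": "id-0001"}
linked = link_citations("As shown in [ref1] and [ref2].", references)
print(linked)
```

Only the citation that has an identifier becomes a link, so a paper could be processed even while some of its references are still missing from whatever registry holds the identifiers.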
    But there's also something missing from that picture, and it's something quite inherent to science itself: reproducibility of results. Most of the time, results appear in tables, but they are almost impossible to reproduce (at least in most computer science papers). The program that produced them is usually homebrew, or has some homebrew part, and it's not available either.
    Why not make the datasets and all the program sources needed to reach the published result available with the paper too and, if possible, under a sensible, GPL-like license? It would make duplicated effort much less likely and, besides, scientists would strive to make science really available to others. Nowadays you see the same results published over and over because, sometimes, a lot of effort goes into reprogramming an algorithm or typing in a dataset.
    Then, once we have everything together (program + dataset + text), putting it all into a common format would be the way to go, so that searches would be much easier. It's easy now to tell references apart from the rest of the text, and sometimes even the abstract, but using an XML format for publishing would allow easy parsing of the text, and even easy comparison of results. Comparing several results would be as easy as making an XPath query. In fact, there is such a thing, STMML (Scientific, Technical and Medical Markup Language), but I'm not sure how popular it is (probably next to nil; I'd never heard of it).
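To give an idea of what comparing results with an XPath query might look like, here is a minimal sketch. The markup is invented for illustration and is not STMML: the element names, attribute names, paper ids and numbers are all assumptions. It relies only on the limited XPath subset that Python's standard `xml.etree.ElementTree` supports.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for two papers. Element names, attribute names and
# values are invented for illustration; they are not taken from STMML.
PAPERS_XML = """
<papers>
  <paper id="paper-a">
    <result algorithm="GA" dataset="toy-benchmark" metric="error">0.12</result>
    <result algorithm="GA" dataset="other-benchmark" metric="error">0.30</result>
  </paper>
  <paper id="paper-b">
    <result algorithm="hill-climbing" dataset="toy-benchmark" metric="error">0.15</result>
  </paper>
</papers>
"""


def results_for(root, dataset, metric):
    """Return (paper id, algorithm, value) for every matching result."""
    # XPath predicate selecting results by dataset and metric attributes.
    query = f"result[@dataset='{dataset}'][@metric='{metric}']"
    rows = []
    for paper in root.findall("paper"):
        for result in paper.findall(query):
            rows.append(
                (paper.get("id"), result.get("algorithm"), float(result.text))
            )
    return rows


root = ET.fromstring(PAPERS_XML)
comparison = results_for(root, "toy-benchmark", "error")

# List the results on the shared dataset, lowest error first.
for paper_id, algorithm, value in sorted(comparison, key=lambda row: row[2]):
    print(f"{paper_id}: {algorithm} -> {value}")
```

The query only works because both papers describe their results with the same attributes, which is the whole argument for a common format.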
    I guess there's still a long way to go. Meanwhile, just inserting hyperlinks in the papers we publish and making sources and datasets available can be an intermediate solution.

    2003-10-27 01:39 | 4 Comment(s) | Filed in


    From: Ctugha Date: 2003-10-27 04:57

    Yeah. For some stuff it is really hard to find the original data and software. For experiments it's even harder than for simulations (anybody interested in taking a look at the non-analyzed data?). Actually, I have even seen "benchmarks" of models of reading (cognitive science, you know), a trend that began with Coltheart's DRC paper. After that, benchmarking the original models with the same data is beginning to be more common in psychology.

    From: JJ Date: 2003-10-27 14:51

    For most things, it's almost impossible, even if you ask the original authors. And it should be compulsory.

    From: fernand0 Date: 2003-10-27 17:55

    Very good points.

    From: JJ Date: 2003-10-27 18:03



    © 2002 - 2008 jmerelo
    Powered by Blogalia