BloJJ

Adventures of a multidimensional freak

This is Juan Julián Merelo Guervós's English-language blog. He teaches computer science at the University of Granada, in southern Spain. Come back here to read about politics and technology, with a new twist.

    This work is licensed under a Creative Commons License.

    Automatic Detection of Trends in Text Streams: An Evolutionary Approach

    I just uploaded to arXiv a paper I co-wrote with Lourdes Araújo, who is currently at the Universidad Complutense de Madrid. The paper tries to extract some structure from text streams, following Kleinberg, and uses stories posted on this very site for tests and experiments. It's currently under review at a journal. Here's the abstract:
    This paper presents an evolutionary algorithm for modeling the arrival dates of document streams, which is any time-stamped collection of documents, such as newscasts, e-mails, IRC conversations, scientific journals archives and weblog postings. This algorithm assigns frequencies (number of document arrivals per time unit) to time intervals so that it produces an optimal fit to the data. The optimization is a trade off between accurately fitting the data and avoiding too many frequency changes; this way the analysis is able to find fits which ignore the noise. Classical dynamic programming algorithms are limited by memory and efficiency requirements, which can be a problem when dealing with long streams. This suggests the application of alternative search methods which allow for some degree of uncertainty to achieve tractability. Experiments have shown that the designed evolutionary algorithm is able to reach the same solution quality as those classical dynamic programming algorithms in a shorter time. We have also explored different probabilistic models to optimize the fitting of the date streams, and applied these algorithms to infer whether a new arrival increases or decreases interest in the topic the document stream is about.
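To make the trade-off in the abstract concrete, here is a minimal sketch, not the algorithm from the paper. It assumes exponentially distributed gaps between arrivals and a flat penalty for every change of rate, which is a simplification in the spirit of Kleinberg's cost. The names `fit_cost`, `mutate` and `evolve`, the segment-overwrite mutation and the elitist selection are all illustrative choices of mine.

```python
import math
import random


def fit_cost(gaps, rates, states, change_penalty=1.0):
    """Negative log-likelihood of exponential gaps plus a penalty per rate change.

    states[i] is the index into `rates` assigned to gaps[i] (assumed model).
    """
    nll = sum(rates[s] * g - math.log(rates[s]) for g, s in zip(gaps, states))
    changes = sum(1 for a, b in zip(states, states[1:]) if a != b)
    return nll + change_penalty * changes


def mutate(states, n_rates, rng):
    """Overwrite a random contiguous segment with one randomly chosen rate index."""
    child = list(states)
    i = rng.randrange(len(child))
    j = rng.randrange(i, len(child)) + 1  # segment is child[i:j], at least one item
    child[i:j] = [rng.randrange(n_rates)] * (j - i)
    return child


def evolve(gaps, rates, pop_size=30, generations=200, change_penalty=1.0, seed=0):
    """Elitist (mu + lambda) search over rate assignments; returns the best one found."""
    rng = random.Random(seed)
    n = len(gaps)

    def cost(states):
        return fit_cost(gaps, rates, states, change_penalty)

    # Start from constant-rate assignments, cycling through the candidate rates.
    population = [[k % len(rates)] * n for k in range(pop_size)]
    for _ in range(generations):
        children = [
            mutate(rng.choice(population), len(rates), rng) for _ in range(pop_size)
        ]
        # Keep the cheapest individuals among parents and children.
        population = sorted(population + children, key=cost)[:pop_size]
    return population[0]


if __name__ == "__main__":
    # Ten slow arrivals followed by ten fast ones (made-up data).
    gaps = [1.0] * 10 + [0.1] * 10
    rates = [1.0, 10.0]  # documents per time unit
    best = evolve(gaps, rates)
    print(best, fit_cost(gaps, rates, best))
```

Raising `change_penalty` pushes the search toward fewer, longer intervals, so short noisy bursts are ignored. Lowering it lets the fit follow the data more closely.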

    2006-01-12 20:13 | 1 Comment(s)

    Comments

    1
    From: rvr Date: 2006-01-12 20:34

    Interesting :)




    © 2002 - 2008 jmerelo