Data and Text Mining
1. In last year's session we raised the idea of writing a review article that shows the value of data mining for neuroscientists. Today we discussed and expanded this idea in considerable detail. On the one hand, we identified a series of "success stories" in image analysis, genomic analysis, microarray data and text mining, which can be drawn upon for the paper. On the other hand, we also recognized the value of using public data repositories even when no formal data mining techniques are applied to the data: Alzforum, PubMed Central and NeuroDB are just three examples of valuable resources that are not designed primarily for data mining. We also identified a number of impediments that work against the deposition of data into public resources and/or make it difficult to use the data for data mining. All in all, the members of the working group felt that it would still be worthwhile to oversee the writing of a review article. One possible venue is J. Biomed. Discovery and Collaboration, but the final choice of journal would depend on the final shape of the article.
Action item: Via the Listserv, members will nominate individuals to be interviewed for the article (those with success stories and those who maintain public data resources) and will nominate or self-nominate prospective co-authors (trying to keep the co-author list at six or fewer). Neil will create an outline of the review article for comment, and we will proceed to start writing if/when there is consensus that we have enough success stories to report.
2. In last year's session we raised the idea of reviving the SFN Short Course on Bioinformatics, possibly under Rob Williams, and possibly with the cooperation of SFN to hold it as a satellite event (even though it would no longer be an official short course). This idea continues to be regarded with enthusiasm.
Action item: Neil will follow up with Rob Williams to check whether he is still willing to supervise such a course. Neil will also approach one or more Biocomputing Centers to learn whether they will assist in funding the course, with the expectation that the course will largely pay for itself from student registration fees.
3. In last year's session we proposed setting up an online compendium of data and text mining resources. This has since been set up at http://arrowsmith.psych.uic.edu.
Action item: The compendium will continue to be updated and expanded over time, and suggestions from working group members and others are welcome.
4. In last year's session we raised the idea of hosting a workshop on a data-mining-related topic, for example the indexing and representation of scientific papers so that they can be more easily data-mined. This was seen as still worthwhile and timely, though the working group itself is not the best group to sponsor or organize such a meeting.
Action item: Neil will seek to identify an interested group to help sponsor the workshop, and an appropriate meeting to which the workshop can be attached.
