Archive for the ‘Data management and mining’ Category

Don’t you ever wish you could just open your favorite browser, type a question (in natural language, not computer-speak) into the input box, then wait a millisecond for the right answer? Or, better yet, just turn on the computer, verbally ask a question, and wait for a response (think the Starship Enterprise here)? Spencer Tracy and Katharine Hepburn dealt with the “information search & retrieval” problem in the 1957 movie “Desk Set,” in which a giant computer was brought in to supplement a staff of librarians, with the thought that it would improve efficiency. The computer took your question, submitted on a sheet of paper, did some crunching in the background, then spit out the answer - correctly (most of the time). Maybe even better would be something like a virtual librarian (think the one in Neal Stephenson’s 1992 book Snow Crash), an avatar who takes your question, sifts through, presumably, yottabytes of data in milliseconds, and gives an answer. Of course, the avatar (nothing more than code brought to life) is incapable of thinking, which is where the real problem lies. We haven’t realized Stephenson’s or Gene Roddenberry’s vision yet, but plenty of folks are working on it. To get a glimpse of the current status on this front, as well as where we might be headed, check out “The Ultimate Answer Machine” in the Aug. 6th issue of InformationWeek or read it online here (same article, different title).
For all the extraterrestrial fans out there, and in recognition of the 60th anniversary of the Roswell incident, it seems only natural to take a look at the latest in public distributed computing. Purchasing a supercomputer (or time on one) is one way to perform research that requires heavy computational power. Another is to utilize the idle time and computing power of broadband-connected public PCs. One such project (and one you Roswell buffs should appreciate) is the SETI@home effort, established in 1999 to use spare PC computing cycles to detect radio signals from space. If interested, here’s a pretty good primer on the project, though a little dated. And many projects of a humanitarian nature are cropping up to take advantage of the growing number of personal computers worldwide. Interested in letting your own computer help with cancer research, climate-change research, and the like? This site might be of interest.
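The basic idea behind these projects can be sketched in a few lines: a coordinator splits one big job into small, independent "work units," volunteer machines each crunch a unit during idle time, and the coordinator merges the partial results. This is only a toy illustration of the concept - all the function names here are hypothetical, and real systems like SETI@home (and the BOINC platform it later ran on) use a much more elaborate client/server protocol with redundancy and result validation.

```python
def make_work_units(data, unit_size):
    """Coordinator side: split a large dataset into independent chunks
    ("work units") that can be processed in any order, on any machine."""
    return [data[i:i + unit_size] for i in range(0, len(data), unit_size)]

def volunteer_compute(unit, threshold):
    """Volunteer side: the work an idle PC would do on one unit - here,
    a stand-in analysis that flags 'interesting' samples above a threshold."""
    return [x for x in unit if x > threshold]

def run_project(data, unit_size, threshold):
    """Coordinator side: hand out every unit, then merge the partial
    results. In a real project the units would go out over the network
    to thousands of PCs instead of running in this local loop."""
    hits = []
    for unit in make_work_units(data, unit_size):
        hits.extend(volunteer_compute(unit, threshold))
    return hits
```

The key property that makes the whole scheme work is that the units are independent, so it doesn't matter which volunteer processes which chunk, or when the results come back.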
If you’re interested in the previous post about emerging organizational structures and the challenges they face in utilizing and managing the data lifecycle, especially within academia, then the upcoming issue of CTWatch Quarterly should be worth a look. Among its articles is one by Herbert Van de Sompel and Carl Lagoze focusing on digital interoperability within scholarly communication. Expect some very interesting discussion and information about the Object Re-Use and Exchange (ORE) project of the Open Archives Initiative (OAI).
In a post from last month (http://www.ctwatch.org/blog/archives/digital-monographs), the digitization of books by Google was mentioned. Amazon and Microsoft are in the picture as well. Bringing information to the masses, especially in the form of published material, is taking on new salience for many web-based businesses (and especially for the book publishing industry). This article on book digitization revisits the issue. What’s not mentioned much is the role of hardware in the effort. E-books aren’t new, nor are the technologies created to view them. But e-books have never really caught on, and a big reason is the display technology. Palm, Sony, and Philips Electronics are just three players who have tested the e-book waters, but display technology still can’t match the high contrast of print, at least not in a form that’s widely portable and affordable. And haptics still hasn’t produced a replacement for people’s comfort with paper.
NSF recently awarded a group of universities $10 million over five years to set up and operate a grid that will allow researchers and students to access physics data produced by the Large Hadron Collider at CERN in Geneva, Switzerland. The Data Intensive Science University Network, or DISUN for short, will provide access to results from the Compact Muon Solenoid (CMS) experiment, which will account for a portion of the petabytes of data produced by the Collider annually. The CMS effort will also contribute to other grid projects including the Open Science Grid.
More detailed information about the project can be found in Supercomputing Online’s story about DISUN from last week.
The latest issue of D-Lib Magazine has an interesting commentary on the future of digital libraries by Clifford Lynch, Executive Director of the Coalition for Networked Information. Tracing the evolution of digital libraries since the 1960s, his article examines some recent accomplishments and concludes with a list of the more interesting issues facing digital library research. Digital libraries play an integral role in cyberinfrastructure but are often underemphasized compared to more glamorous components such as supercomputers and fast networks.