Proceedings
Best quote: Librarians are like Mr. Paperclip from MS Office - we pop up when you least expect it and try to offer you help...
This conference focused on the science library's role in supporting e-science and integrating into research collaborations and science departments. There was a mixture of speakers: government, library and institute directors, and a few librarians. The presentations were a mixture of big picture descriptions and some concrete examples. I felt like there weren't as many concrete solutions that we could take back to the library and implement, but perhaps just educating the library community on how radically e-science is changing the research landscape is the necessary first step.
I've included the highlights from my session notes below (let me know if you'd like to see my full notes in gory detail). Check out the proceedings link above for PowerPoint and document files for most of the speakers.
As a side note, our poster about GatorScholar was well-received with many people already aware of the project from either Val's USAIN presentations, the SLA poster, or from hearing about Cornell's project. Medha Devare was one of the panel reactors and she mentioned our collaboration in her presentation. Most of the poster visitors seemed very interested in starting their own version and perhaps at some point we'll have a network of databases.
Thursday
E-Science: Trends, Transformations & Responses
Convener and Moderator: Wendy Lougee, University of Minnesota
Speaker: Chris Greer, Director, National Coordination Office
NCO part of Office of Science and Tech Policy, coordinates all major science orgs
E-Science defined as digital data driven, distributed and collaborative - allows global interaction.
Science pushed to be trans-disciplinary - scientists pushed to areas where they have no formal training - continual learning important;
It fuses the pillars of science: experiment, theory, model/simulation, observation & correlation
Come a long way: ARPANET -> internet, redefinition of the computer (ENIAC to cloud computing)
Question: how many libraries do we need? Greer thinks this will change over time.
Future library: Imagine all text in your pocket, questions answered at the speed of light (semantic web concept), wearing contact lenses that merge the physical and digital worlds -> in the long run we'll have the seamless merging of worlds
Science is global and thrives in a world that is not limited to 4-D. Cyberinfrastructure reduces time and distance. Need computational capacity and connectivity with information.
The challenge for society: responsibility to preserve data.
Reinventing the library:
Challenges: institutional commitment, sustainable funding model, defining the library user community (collection access is global so who is the user?), legal and policy frameworks, library workforce, library as computational center, sustainable technology framework.
We've come a long way but we're at the beginning of a dramatic change.
2. A Case Study in E-Science: Building Ecological Informatics Solutions for Multi-Decadal Research
William Michener, Research Professor (Biology) and Associate Director, Long-Term Ecological Research Network Office, University of New Mexico
Data and information challenges:
data are massively dispersed and lost sometimes
data integration - scientists use different formats and models. Lots of work to integrate even simple datasets
problem of information and storage
LTER has a lot of data archives that are very narrow in scope of data stored. Also has a lot of tools. Working on adoption of tools - predict an exponential increase with time.
Future: science will drive what they do. Look at critical areas in the earth system. Understanding changes in world involve a pyramid in data collection scale (remote sensing to sampling)
Technology directions: Cyberinfrastructure is enabling the science, consider whole-data-life-cycle, domain agnostic solutions (since budgets are bad, solutions have to be universal across all the sciences)
We need
Cyberinfrastructure that enables: data needs to be able to pull in from different sources, easy integration, tools that allow visualization
Support for the data lifecycle - need to work on metadata interoperability across data holdings.
Sociocultural Directions:
education and training: science now is lifelong learning
engaging citizens in science: have websites to educate the public
building global communities of practice: develop CI as a collaborative team
expand globally in future, expand with academic, govt, NGOs and companies
Challenges:
Broad active community engagement: need educators to teach students in best practices
transparent governance
adoption of sustainable business models
3. Rick Luce, Vice Provost and Director of University Libraries, Emory University Libraries
"Making a Quantum Leap to eResearch Support: a new world of opportunities and challenges for research libraries"
Where do we need to go: intelligent grid presence, collaboration support, social software, evaluation and research integrity (plus lots of other areas mentioned)
Dataset & repositories: need to have context of data, curation centers, users want mouse-click solutions and will come up with their own solutions if we don't.
PIs are taking more responsibility on projects, becoming publishers and curators. Librarians need to take on the role of middleware.
Researchers want:
information collaboration tools: shared reading, virtual workspaces and whiteboards, webspaces that support wikis, data sets, preprints, videos of conference presentations, news
Need information visualization: browse information using maps of concepts, collaboration and citation networks, coauthorship networks, taxonomies, scatter plots of data, knowledge domain visualization
Where do we need to be: systems to facilitate shared ideas, presence, and creation
Individual libraries can't do this - we need collaborations
Challenges: connect newly forming disciplines and newly emerging fields
Libraries work a lot on support layer but we need to get in the workflow layer where we're connected with scientists and coordinate on a multi-institutional structure
Need new organizational structures: hybrid organizations: subject specialists -> intra-disciplinary teams. The future library office -> lives in project space/virtual lab
Need informaticians and informationists (embedded librarians)
What percent of our research library content and services are unique? What % of our budget resources support uniqueness? We need to do something others cannot do or do something well that others do poorly.
Library cooperatives are useful for reducing redundancy. Next phase shift requires an expanded mission of shared purpose.
We fall short on scale, speed, agility, resources, and focus. Collective problems require collective action, which requires a shared vision - think cloud computing for libraries
We must do more than aggregate and provide access to shared information: Our job now is to wire people's brains together so that sharing, reasoning, and collaboration become part of everyday work.
Wendy Lougee
Pitfalls: falling back on traditional roles. Currently we don't respond to multi-institutional collaborations; our boundaries stop with the institution.
We need to understand scientists' workflows and identify strategies for embedding librarians into project teams. We need to think about the core expertise of librarians and reimagine the roles of librarians.
What do we do to build this collaborative action? We need to think outside the box.
Data Curation: Issues and Challenges
Convener and Moderator: James Mullins, Dean of Libraries, Purdue University
- Liz Lyon, Director, UKOLN
Transition or Transform? Repositioning the Library for the Petabyte Era
How can libraries work with science (in a very general sense)?
1. Transition or Transform? Need to become embedded and integrated into team science. Many different models of engagement
Geosciences pilot where the library worked with the Geological department to curate their datasets (Edinburgh):
Found: Time needed is longer than anticipated, inventory doesn't have to be comprehensive, little documentation exists
Outcomes: positive, requirement for researcher and auditor training, need to develop a data policy
2. Lots of opportunities for action: leadership by senior managers, faculty coordination, advocacy & training, data documentation best practices
People and Skills: there are not enough specialised data librarians. In UK 5 data librarians. Need to bring diverse communities together - facilitate cooperation between organizations and individuals.
Open science: new range of areas where results are being put onto the web (e.g. GalaxyZoo). Librarians need to be aware of the implications.
3. Need multidisciplinary teams and people in the library. There is a huge skill shortage; need to identify core data skills and integrate them into the LIS curriculum. Recruit different people to the LIS team, rebrand the LIS career. Go from librarianship to Informatics.
- Fran Berman, Director of the San Diego Supercomputer Center, UC San Diego, and Co-chair Blue Ribbon Task Force on Sustainable Digital Preservation and Access
Researchers are detectives, shows different major questions (SAF, Brown Dwarfs, bridge stress, Income dynamics over 40 years, Disease spread-Protein Data Bank) - key collections all over.
CI Support: all these issues are crucial. Researchers want an easy-to-use set of tools to make the most of their data.
She finds different preservation profiles: timescale, datascale, well-tended to poor, level of policy restrictions, planned vs. ad hoc approach
Researchers focused on new projects, customization of solutions to problems, collaboration
Researchers need help: developing management, preservation and use environments, proper curation and annotation, navigating policy, regulation, IP, sustainability
Questions about preservation: what should we save and who should pay for it? Just saving everything isn't an option. 2007 was the crossover year - digital data exceeded the amount of available storage. What do we want to save? Who is we?
Society: official and historically valuable data, Fed agency or inst normally takes part.
Research community: PDB, NVO.
Me: medical record, financial data, digital photos - real commercial market for preservation solutions.
What do we have to save?
private sector: HIPAA, Sarbanes-Oxley,
OMB regulations for fed funded research data (3 years, not always easy to do).
Economics: many costs associated with preservation. Maintenance upkeep, software, utilities, space, networking, security, etc.
UCSD forged partnership with library. Trying to create a preservation grid with formal policies, nationwide grid with other institutions.
Panel Responders:
- Sayeed Choudhury, Associate Dean of University Libraries and Hodson Director of the Digital Research and Curation Center, Johns Hopkins University
Data Curation Issues and Challenges:
It makes sense to help scientists deal with public and higher levels of data, not the raw data.
Considerations: need to work within their systems, consider gateways for systems as part of infrastructure development (think about railroad gauge), focus on both human and tech components of infrastructure, human interoperability is more difficult than tech interoperability, trust is key!
Questions: What about the cloud or the crowd? Can Flickr help us with data curation? What are the fundamental differences between data and collections? Human readable vs. machine readable? How do we transfer principles into new practices? What are we trying to sustain? Data? Scholarship? Our organizations?
Supporting Virtual Orgs
- Thomas A. Finholt, Director, Collaboratory for Research on Electronic Work (CREW) and Research Professor & Associate Dean for Research and Innovation, School of Information, University of Michigan
Changing nature of geographically-distributed collaboration:
history: transition in terms of distributed work. Much of what came before (collaboratory, video conf) had a precedent, but what is newly emerging (crowdsourcing, VOs) has no precedent; the lack of a traditional context leaves us a bit adrift.
Lesson 1: anticipate cultural differences.
Domain scientists: characteristics: power distance (bias toward seniority, hierarchical), individualist (solo PI, individual genius), masculine (adversarial and competitive), uncertainty avoidance
CI developers: power distance (bias toward talent, egalitarian), collectivist (project model), masculine, embrace risk
Lesson 2: plan for first contact.
It can be tough to recognize successful innovations: first efforts are often awkward hybrids
Crowdsourcing: idea that we send out challenges and solutions come to us (ex. Innocentive website, Games with a Purpose). We don't know who is going to do the work, effort is contributed voluntarily -> incentives are important to motivate work
Delegation of organizational work: people can count on organizations to do some of the basic policy work. Much attention has focused on technology and processes to support social ties; an alternative course is the use of technology to supplant social ties -> think of this as organizing without the work of organizing, where questions of who to trust, who pays, and who is permitted to use the resources are managed by middleware.
Group work is an inevitable fact of org life.
- Medha Devare, Life Sciences and Bioinformatics Librarian, Mann Library, Cornell University
Library contributions: technology choices, tools; tech support/guidance; subject expertise; understanding of research landscape; vision - user needs of the future?
Examples of library support: VIVO, DataStar (supports data-sharing among researchers)
DataStar: Data Staging Repository: supports data sharing, esp during research process, promotes publishing or archiving to discipline specific data centers and/or to Cornell's DR. Nascent stage
Reinventing the library? Librarians as middle-ware to facilitate process of connecting and creating coherence across disciplines - both VIVO and DataStar aid this.
Hope that both tools seamlessly interact with each other.
D. Scott Brandt, Associate Dean for Research, Purdue University Library
Tries to embed librarians in research teams. We have to redefine what we do and what we collect.