Data at Columbia

Introductory paragraph.

Brainstorm Ideas

Who Can Help Me Initiate a Project?

The Office of the Executive Vice President for Research can help when you are planning a research project. The office offers a Sponsored Projects Certificate program for research administrators and provides guidance for finding funding for research.

Where Can I Find Funding?

Sponsored Projects Administration and the Health Sciences Library provide PIVOT, a tool for identifying funding opportunities. HSL has a tutorial on using PIVOT.

Who Can Help Me Think about Data?

Subject librarians and the Libraries’ Research Data Services team can help with brainstorming research ideas that involve data. Schedule a consultation with Research Data Services by emailing data@library.columbia.edu.

Capture Data

Who Can Help Me Capture Data for My Project?

Data is never raw; it is always cooked. Similarly, data is never given; it is always taken. For help with capturing data, contact your subject librarian or reach out to the Libraries’ Research Data Services team. Schedule a consultation by emailing data@library.columbia.edu.

Where Can I Find Data?

For researchers at Columbia who are beginning to collect data, Research Data Services offers both the Numeric Data Collection and the Geospatial Data Catalog.

Manage Data

How Can I Create a Data Management Plan?

Managing data and developing a plan for that management are vital aspects of a successful research project.

For grant proposals that need a concrete Data Management Plan (DMP), the Office of the Executive Vice President for Research has collected a list of resources that are especially useful for grant-funded researchers. The list includes links to the DMPTool site for drafting plans and links to sample DMPs.

Additionally, the Libraries’ Research Data Services team can consult on Data Management Plans. Schedule a consultation by emailing data@library.columbia.edu.

What Kinds of Software or Computing Environment Can I Use to Manage Data?

Columbia University Information Technology (CUIT) offers several software resources for data management, including:

  • A site license to Globus for secure data transfer
  • A secure data enclave for creating secure virtual desktop research environments
  • A site license to LabArchives, a secure, cloud-based electronic lab notebook
  • Various options for data storage and backups
  • RASCAL, a web application for research compliance and administration
  • A cloud research computing consulting service for help building a cloud computing environment

Additionally, the Libraries’ Research Data Services group can consult on using Git to manage data and projects. Schedule a consultation by emailing data@library.columbia.edu.

How Can I Securely Store My Sensitive Data?

CUIT offers a secure data enclave for creating secure virtual desktop research environments.

(more? LabArchives?)

How Can I Make Sure My Data Is Set Up/Stored Properly for Reproducibility?

(text on FAIR/reproducibility)

Additionally, the Libraries’ Research Data Services group can consult on using Git to manage data and projects. Schedule a consultation by emailing data@library.columbia.edu.

Process and Analyze Data

Where Can I Get Support for Processing Data?

CUIT’s Research Computing Services team provides support for processing data. They offer workshops on computing for research both in the cloud and on Columbia’s high-performance computing cluster. Similarly, they offer introductory consulting on setting up a cloud computing environment.

Additionally, the Libraries’ Research Data Services group can consult on processing data using the Pandas Python library or using tools like OpenRefine. Schedule a consultation by emailing data@library.columbia.edu.
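
As a rough illustration of the kind of processing the Pandas library supports, the sketch below tidies a small table: it normalizes column headers, drops duplicate rows, and converts text columns to numbers. The survey data, the column names, and the `clean_survey` helper are all invented for this example and do not come from any Columbia dataset or service; a real project would adapt each step to its own data.

```python
import io

import pandas as pd

# Hypothetical survey extract; the column names and values are invented.
RAW_CSV = """Respondent ID, Age ,Income
1,34,52000
2,unknown,61000
2,unknown,61000
3,29,
"""


def clean_survey(csv_text: str) -> pd.DataFrame:
    """Return a tidied copy of the survey table (illustrative helper)."""
    df = pd.read_csv(io.StringIO(csv_text))
    # Normalize headers: strip whitespace, lower-case, use underscores.
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    # Drop exact duplicate rows, keeping the first occurrence.
    df = df.drop_duplicates().reset_index(drop=True)
    # Convert to numbers; entries that cannot be parsed become NaN.
    for col in ("age", "income"):
        df[col] = pd.to_numeric(df[col], errors="coerce")
    return df


cleaned = clean_survey(RAW_CSV)
print(cleaned)
```

Keeping the steps in one function makes it easier to rerun the same cleaning whenever the source file is updated.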

HSL provides workshops on R and QGIS…. See https://library.cumc.columbia.edu/events for details.

The Libraries’ Research Data Services group provides several workshops on data analysis. Upcoming workshops are listed on the Workshops, Training, & Events page, but they have also provided workshops on:

  • Panel surveys
  • The Python Pandas library

Finally, RDS can help with one-on-one consultations. Schedule a meeting by emailing data@library.columbia.edu.

What Kinds of Workshops for Data Analysis Are Available?

CUIT offers workshops on computing for research both in the cloud and on Columbia’s high-performance computing cluster.

Similarly, the Health Sciences Library provides data analysis workshops via their events page. Recent workshops include:

  • GIS Special Workshop on Density Mapping

The Libraries’ Research Data Services group provides several workshops on data analysis. Upcoming workshops are listed on the Workshops, Training, & Events page, but they have also provided workshops on:

  • Panel surveys
  • The Python Pandas library

What Kinds of Data Analysis Consultations Are Available?

CUIT’s Research Computing Services team offers introductory consulting on setting up a cloud computing environment.

Additionally, the Libraries’ Research Data Services group can consult on processing data using the Pandas Python library or using tools like OpenRefine. Schedule a consultation by emailing data@library.columbia.edu.

Publish and Share Data

Who Can Help Me Create Metadata for My Data?

How to make the metadata? What about license agreements that have to get signed? Ensuring that the metadata is in good shape.

Publish and Share Data
  • Who can help me create metadata for my data?
  • Who can help me determine where I can publish my data?
  • When should I start thinking about publishing my data?
  • Who do I talk to about copyright for my data?
  • Who can I talk to about the impact factor for my work?
  • Who can I talk to about different options for sharing my data?
  • How can I make sure my data is findable, accessible, interoperable, and/or reusable?

https://academiccommons.columbia.edu/faq incorporate from here for the last four sections

When it is time to publish data, the Libraries and CUIT provide many resources:

  • Academic Commons, the Columbia institutional repository, provides global access to the research and scholarship produced at Columbia.
  • The Libraries maintain a license to Dryad, a curated resource that makes research data discoverable, freely reusable, and citable.
  • The Libraries’ Research Data Services group can consult on discipline-specific repositories for publishing data (ICPSR, Roper, QDR, etc.). Schedule a consultation by emailing data@library.columbia.edu.
  • The Digital Scholarship team in the Libraries is also available for consulting about publishing. Email them at digitalscholarship@library.columbia.edu.
  • CUIT provides free web hosting for Columbia researchers with sites.columbia.edu.

Any data made available for publication or re-use is already implicitly shareable, simply by linking to the relevant dataset’s DOI.
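
Linking to a dataset's DOI can be sketched in a few lines. The example below assumes the standard https://doi.org/ resolver and turns a DOI, written in any of a few common forms, into a shareable link. The `doi_to_url` helper and the DOI shown are made up for illustration and do not point to a real dataset.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Build a resolvable link from a DOI (illustrative helper)."""
    cleaned = doi.strip()
    # Accept a few common written forms and reduce them to the bare DOI.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return DOI_RESOLVER + quote(cleaned, safe="/")


# 10.1234/example.dataset is a made-up DOI used only for illustration.
link = doi_to_url("doi:10.1234/example.dataset")
print(link)
```

A link built this way can be placed in a paper, a project website, or an email, and it continues to resolve even if the repository later moves the dataset's landing page.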

Sharing data as something that is part of the cycle as a whole. Data platform. Think of this as more than just the last step.

Making the data findable/interoperable/

Columbia provides different opportunities for you to make your data reusable:

  • Academic Commons, the Columbia institutional repository, provides global access to the research and scholarship produced at Columbia.
  • The Libraries may agree to host and list your data as part of their numeric or geospatial data collections.
  • The Libraries maintain a license to Dryad, a curated resource that makes research data discoverable, freely reusable, and citable.
  • The Digital Scholarship team in the Libraries is also available for consulting about publishing. Email them at digitalscholarship@library.columbia.edu.
  • CUIT provides free web hosting for Columbia researchers with sites.columbia.edu.