Project #18

Summer 2015 Student Research

Sub-project A: LDI Replication project

We are looking for up to 4 students to work on the LDI replication project as undergraduate research interns.

Work description

Team members will access journal websites and analyze the materials provided by article authors on how to replicate the article results. The provided materials and instructions will be assessed using a checklist. The instructions will be followed (if possible), and success or failure in (i) performing the analysis and (ii) replicating the authors' results will be documented. Teamwork is encouraged, and activity will be supervised by a graduate student or faculty member.

Duration: Summer 2015

Necessary qualifications: Some experience with Stata, SAS, R, or Matlab, as an undergraduate assistant or in previous work, is required. Having taken one of the CISER workshops and experience with Subversion or Git are assets.

Location: Presence on campus for a kickoff meeting after classes end (date TBD) is required. Presence on campus is not required for the actual work (all computer work will be performed on remotely accessible servers at CISER and ECCO). Work will be coordinated through a Git repository. Presence on campus for a final workgroup meeting in August is encouraged but not required (though participation via videoconferencing is required if not physically present).

Sub-project B: LDI - NCRN User Interface Accessibility project - This position has been filled.

We are looking for up to 2 students to work on the NCRN metadata project as undergraduate research interns, implementing and testing accessibility features in the web application.

Work description

Research assistants will work with a professional programmer and a faculty member to audit and review multiple components of the CED²AR web application for compliance with accessibility guidelines. They may also be asked to correct some of the identified deficiencies.

Duration: Summer 2015

Necessary qualifications: The ideal candidate has programming experience in Java, XML, and web development (CSS, frameworks, JavaScript), and some knowledge of accessibility requirements and standards. Experience with Python and with Subversion or Git is an asset. References (courses taken, previous experience) are required.

Location: Presence on campus (Cornell Institute for Social and Economic Research, CISER) for a kickoff meeting after classes end (date TBD) is required. Presence on campus is encouraged, but some work can be done offsite.

Sub-project C: LDI - NCRN Metadata capture project 2015

We are looking for up to 2 students to work on the NCRN metadata project as undergraduate research interns.

Work description

Metadata is “data about data”. In this project, creating metadata means parsing information from codebook PDFs and raw Stata or CSV files into a metadata standard called DDI (Data Documentation Initiative). Team members will work with a graduate student and a faculty member to create such metadata for a variety of social science datasets (Census Bureau restricted-access files, NHIS, etc.) for the Cornell Node of the NSF-Census Research Network, using the CED²AR DDI editing web interface. Once completed, the metadata will be displayed at http://www2.ncrn.cornell.edu/ced2ar-web.
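To give applicants a rough idea of what turning a raw CSV file into DDI-style metadata involves, here is a minimal Python sketch. It is only an illustration: the actual work uses the CED²AR web interface, not a script. The function name `csv_to_ddi_sketch` and the sample data are hypothetical, and the element names (`codeBook`, `dataDscr`, `var`, `labl`) are assumed to follow DDI Codebook naming; the output is heavily simplified and not schema-valid DDI.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_ddi_sketch(csv_text):
    """Build a simplified, DDI-Codebook-style XML string from CSV text.

    Hypothetical helper for illustration only; real DDI requires many
    more elements (study description, file description, and so on).
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header = rows[0]

    root = ET.Element("codeBook")
    data_dscr = ET.SubElement(root, "dataDscr")
    for name in header:
        # One <var> element per CSV column.
        var = ET.SubElement(data_dscr, "var", {"name": name})
        # Placeholder label: a real label would be taken from the codebook PDF.
        ET.SubElement(var, "labl").text = name
    return ET.tostring(root, encoding="unicode")


# Made-up sample data, not from any real dataset.
sample = "age,income\n34,52000\n41,\n"
ddi_xml = csv_to_ddi_sketch(sample)
print(ddi_xml)
```

In practice, most of the effort goes into the parts this sketch skips: finding the correct variable labels, value codes, and notes in the codebook and entering them accurately.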

Duration: Summer 2015

Necessary qualifications: The ideal candidate has some experience using Stata, SAS, or CSV files, and some knowledge about social science datasets. Programming knowledge is not required.

Location: Presence on campus for a kickoff meeting after classes end (date TBD) is required. The work itself can be performed off-campus, but daily and weekly videochats will occur. Presence on campus for a final workgroup meeting in August is encouraged but not required (though participation via videoconferencing is required if not physically present).

Sub-project D: Creation of a synthetic data product

The goal is to apply existing methodology for synthetic data creation to a public-use data product with cell suppression, in order to create a usable data product (useful for geographic, economic, or sociological analysis) without missing cells.

Work description

Statistical agencies protect information on individual respondents (firms, families, individuals) using a variety of techniques. For this project, we will take one such product, which uses "cell suppression" together with complementary suppressions, and improve it by using synthetic data techniques to "fill in the blanks". We will use one of two existing methodologies (developed for use in other contexts), and apply it to a specific context. If successful, the resulting data product will be made available to the scientific community as a value-added statistical data product.
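The idea of "filling in the blanks" can be illustrated with a toy Python sketch. This is not either of the two methodologies mentioned above, which are not specified here. It is a deliberately naive stand-in that assumes the row total is published and allocates the unpublished remainder at random across the suppressed cells. The function name `fill_suppressed` and all numbers are hypothetical.

```python
import random


def fill_suppressed(row, row_total, rng):
    """Replace None (suppressed) cells with random draws summing to the residual.

    Toy illustration only: each unit of the residual is assigned to one
    suppressed cell uniformly at random, so the published total is preserved.
    """
    known = sum(v for v in row if v is not None)
    residual = row_total - known
    if residual < 0:
        raise ValueError("published cells exceed the published total")
    blanks = [i for i, v in enumerate(row) if v is None]
    if not blanks:
        return list(row)
    draws = {i: 0 for i in blanks}
    for _ in range(residual):
        draws[rng.choice(blanks)] += 1
    return [draws[i] if v is None else v for i, v in enumerate(row)]


rng = random.Random(2015)              # fixed seed so the draw is repeatable
published_row = [120, None, 45, None]  # None marks a suppressed cell (made-up data)
published_total = 200                  # assumption: the row total is published
filled_row = fill_suppressed(published_row, published_total, rng)
print(filled_row)
```

A real synthetic data method would draw the missing values from a statistical model fitted to related data and would need to respect the confidentiality protection that the suppression was meant to provide; the sketch only shows the basic constraint that filled-in cells must remain consistent with the published totals.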

Necessary qualifications: The ideal candidate has experience in statistics or econometrics and prior experience with R.

Location: Presence on campus for a kickoff meeting after classes end (date TBD) is required. The work itself can be performed off-campus, but weekly videochats will occur. Presence on campus for a final workgroup meeting in August is encouraged but not required (though participation via videoconferencing is required if not physically present).