To address sustained, high-impact challenges that could be significantly shifted through the addition of data scientists and their services, DataKind will build an elite Data Science Team composed of the best professionals in the field of data science. This team will use untapped data sources, new data technologies, and collaborations with social sector experts to confront the world's most intractable challenges.
The team will consist of six full-time data professionals, including data scientists, machine learning experts, data artists, and software developers. The team will work on long-term projects selected in partnership with major stakeholders, such as foundations, governments, and social sector experts. As data science is new to many of these partners, the team will help scope each project alongside the partner organization, designing a solution that addresses its needs.
DataKind projects will create scalable solutions that will benefit the entire field. For example, a project using satellite imagery to monitor migratory patterns will benefit many conservation organizations. Moreover, algorithms such as these, built for one purpose, will extend to similar problems, such as monitoring refugee movements for more effective aid delivery.
The team will consider projects from a wide range of topics, from financial inclusion to food safety. In DataKind's experience, the success of a project depends far more on the data available than on the specific interest area. The team will initially be hired for three years and will work on various projects that last 9-12 months each, producing five completed projects in total.
DataKind's partners will contribute greatly to the success of this program. Corporate partners will provide financial support and act as thought partners. Foundation partners will provide project funding and thought leadership, and will act as the first 'clients' of the team. Both types of partner will share the results of these projects across their networks as case studies so that other organizations can benefit as well.
Over the course of the commitment timeframe, the Data Science Team will run five projects with the following milestones:
Sourced: Project partners are continuously sourced and vetted according to DataKind's criteria (commitment to becoming data-informed, organizational buy-in, high-impact theory of change).
Scoped: Promising projects are given a full needs assessment, scoped for time and feasibility, and converted to work plans.
Prototyped: The team delivers the first prototypes and results to the partners, giving them an opportunity to rescope the project.
Refined: Prototypes are refined, and any new analysis is done. Continuous check-ins with the stakeholders are critical at this point to refine the final product.
Completed: Deliverables are completed. All findings, code, data, and other deliverables are transferred to the stakeholders. Results are shared via communication channels and DataKind's network.
This entire project process will take 9-12 months from Scoped to Completed for each project, with 'Sourced' being an ongoing process.
2014 Q4 (Oct-Dec):
Announce commitment at CGI
2015 Q1 (Jan-March):
Begin data scientist hiring process
Outreach to major academic institutions and industry partners
Scoping potential projects begins
Outreach to existing DataKind partners and data-interested organizations
2015 Q2 (April-June):
Data science hires made
Team is on-boarded and trained
Project One scoped and announced
2015 Q3 (July-Sept):
Data science hires continue (if full team not on board yet)
2015 Q4 (Oct-Dec):
Project Two scoped and announced
Lessons from Year One are shared with the community
2016 Q1 (Jan-March):
Project One concludes
Project Three scoped and announced
2016 Q2 (April-June):
Project Four scoped and announced
2016 Q3 (July-Sept):
Project Five scoped and announced
2016 Q4 (Oct-Dec):
Project Two concludes
Lessons from Year Two are shared with the community
2017 Q1 (Jan-March):
Project Three concludes
2017 Q2 (April-June):
Project Four concludes
2017 Q3 (July-Sept):
Project Five concludes
2017 Q4 (Oct-Dec):
Hold Learning Summit encompassing lessons learned from the past three years
Share case studies
Write up final reports on the Data Science Team project
What if the same algorithms that allow Amazon and Walmart to deliver products with radical efficiency could be used to ensure emergency supplies reach disaster zones? What if satellite data could be used to make insurance affordable for millions of subsistence farmers? What if text message analysis could predict malaria outbreaks and save lives?
All of these projects are possible today. Mobile phones, sensors, and new software have created an abundance of data that can be mined, understood, and harnessed to make organizations more effective in almost every sector. Just as the 1990s saw computing spread to new non-technical disciplines, every field today is having its 'data moment', making any NGO with a cellphone program just as much a 'data' company as a tech startup. Already, stories are spreading of groups using cellphone data to more accurately estimate poverty, satellite imagery to monitor coastlines, and social media data to help prevent flu outbreaks.
Yet for all of this new technology, the challenges to using data for social good are significant. Data scientists who can harness the power of data are extremely expensive and hard to find. Most of this select group is employed on Wall Street and in Silicon Valley, unavailable to non-profits and governments. Additionally, the field of data science is so new that many social organizations don't yet understand how this new resource could help them achieve their missions. As a result, data science is not often funded, nor is it clear to these organizations how it could be applied.
What is needed is a way to bring data science capacity to these rich, untapped sources of data alongside the experts working in these fields. Only when data scientists are available and communicating with the social sector can that data be unlocked for the greater good.
DataKind is pursuing a 12-month data science project with its newly created in-house data science team, likely focused on the issue of public health in the U.S. The purpose is to identify key sector-wide challenges to providing services that could be eased with novel predictive and data-driven techniques. As such, DataKind is looking for experts and implementing partners in public health and public services to advise on its work. It is also seeking best practices in collective impact models and funders who would support improved public health and social services.
DataKind specializes in providing expert data scientists to mission-driven causes. If CGI members feel that data science could be used to advance their theories of change (i.e., they believe challenging questions could be answered through novel insights derived from new datasets), DataKind is very open to that conversation. DataKind excels at finding novel data sources, exploring data, answering challenging questions from data, and creating data-driven processes in the name of better organizational decision making. Examples include using 60,000 text message conversations to help Crisis Text Line better understand the needs of teens in crisis in the U.S.; working with HURIDOCS to collect and analyze the European Court of Human Rights' rulings to understand which violations were most prevalent and which rulings were not being enforced; and predicting doctors' responses to patient symptoms at local clinics from SMS messages.