Assemble task team to lead sandbox work
|All those interested in participating will be encouraged to do so, but a task team will be required to steer the work, ensuring objectives are pursued and processes are documented.|| |
Obtain and install the necessary hardware, software, etc.
- Set up Hortonworks system architecture
- Set up Pentaho suite
  - configure Pentaho for the Hadoop distribution and version
  - test the configuration
Train the task team to ensure familiarity with the technical tools and to start collaboration between team members
- Use of online documents, tutorials, demonstration videos, etc.
- Potential running of a training session (conditional upon hosting and/or financial support from a participating organization), which could be undertaken alongside another Big Data event to save costs for participants.
|Obtain requisite datasets and undertake analyses in sandbox||July-October 2014|
- Obtain and install datasets (minimum of one from each category outlined in the preceding section). Note: the process of obtaining datasets that are not freely available (whether paid or not) should begin at the onset of the project, so that they are available by this stage of the work.
- For each dataset:
  - study availability of variables
  - analyse the representativeness of the statistical figures
  - study other statistical figures available
  - produce some statistics
  - document all processes and results on an ongoing basis.
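The first two per-dataset checks above can be illustrated with a minimal Python sketch. It is not part of the sandbox tooling named in this plan; the function names, the record layout (one dictionary per observation) and the reference shares are all hypothetical, chosen only to show what "availability of variables" and a simple representativeness comparison might look like in practice.

```python
from collections import Counter


def _is_missing(value):
    # Assumption: None and empty strings count as missing values.
    return value is None or value == ""


def variable_availability(records):
    """Return {variable: share of records holding a non-missing value}."""
    if not records:
        return {}
    variables = sorted({key for record in records for key in record})
    total = len(records)
    return {
        var: sum(1 for r in records if not _is_missing(r.get(var))) / total
        for var in variables
    }


def share_gaps(records, variable, reference_shares):
    """Compare category shares in the data with reference shares.

    Returns {category: observed share minus reference share}; values far
    from zero suggest the dataset over- or under-represents that category.
    """
    values = [r[variable] for r in records if not _is_missing(r.get(variable))]
    counts = Counter(values)
    n = len(values)
    return {
        category: (counts.get(category, 0) / n if n else 0.0) - reference
        for category, reference in reference_shares.items()
    }


# Hypothetical records, standing in for rows of an obtained dataset.
records = [
    {"region": "north", "price": 1.0},
    {"region": "south", "price": None},
    {"region": "north"},
    {"region": "", "price": 2.0},
]

availability = variable_availability(records)
# Hypothetical reference shares, e.g. taken from an official population figure.
gaps = share_gaps(records, "region", {"north": 0.5, "south": 0.5})
```

Here `availability` reports, for each variable, the fraction of records in which it is filled in, and `gaps` shows how far the observed regional shares lie from the assumed reference shares. A real analysis would replace the reference shares with published official figures and would normally use a proper statistical test rather than a raw difference.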
|Produce a general model for producing statistics from Big Data, in order to communicate effectively with statistical organizations||November-December 2014|
- Document findings
- Incorporate documented results into dissemination materials and activities