Phases
An essential task in software engineering and data engineering is to establish a common framework that allows projects to be compared with one another, maintains the quality of the output, and captures the benefits of experience in a systematic way.
With this in mind, we decided to base our work on the CRISP-DM framework. This cyclical methodology has around 30 years of maturity in the market, supporting different data-driven industries.
To make it easier to understand and apply, we propose summarizing this model in 4 phases and 3 cycles, described below:
Phase 1
Data Requirements
Objective: To understand the needs of the business or the user in relation to data.
Key activities:
- Identify the objectives of the visualization (What do you want to communicate or discover?).
- Define the target user (Who will use the visualization?).
- Establish key metrics and relevant indicators.
- Determine the available data sources and their accessibility.


Phase 2
Data Design and Development
Objective: To transform the data into a format suitable for evaluation and use, and to design its internal structure.
Key activities:
- Data cleansing, transformation, and enrichment.
- Data architecture: tables, files, pipelines, and outputs.
- Formula construction and advanced metrics.
- Design of custom visualizations.
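
As an illustration of the cleansing, transformation, and enrichment activity, the following is a minimal sketch in Python using only the standard library. The record layout (`customer`, `amount`, `order_date`) and the function name `clean_rows` are hypothetical and chosen only for this example; a real project would apply the rules agreed on in Phase 1.

```python
from datetime import datetime


def clean_rows(raw_rows):
    """Cleanse, transform, and enrich raw records (hypothetical schema)."""
    seen = set()
    cleaned = []
    for row in raw_rows:
        customer = (row.get("customer") or "").strip().title()
        amount_text = (row.get("amount") or "").strip()
        date_text = (row.get("order_date") or "").strip()

        # Cleansing: drop records that are missing a required field.
        if not customer or not amount_text or not date_text:
            continue

        # Transformation: parse text into typed values; skip unparseable rows.
        try:
            amount = float(amount_text)
            order_date = datetime.strptime(date_text, "%Y-%m-%d").date()
        except ValueError:
            continue

        # Cleansing: remove duplicates that only differ in formatting.
        key = (customer, order_date, amount)
        if key in seen:
            continue
        seen.add(key)

        # Enrichment: derive a month label that is convenient for metrics.
        cleaned.append(
            {
                "customer": customer,
                "amount": amount,
                "order_date": order_date,
                "order_month": order_date.strftime("%Y-%m"),
            }
        )
    return cleaned
```

Rows that fail a rule are silently skipped here to keep the sketch short; in practice it is usually worth collecting them in a rejection list so they can be reviewed during Phase 3.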

Phase 3
Data Evaluation
Objective: To validate the usefulness, clarity and accuracy of solutions before they are deployed.
Key activities:
- Review with key users or stakeholders.
- Integration testing: Do the data sources connect to one another?
- Interpretation tests: Does the visualization communicate what is expected?
- Adjustments to the design, scales, filters, and more.
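
One simple way to approach the integration question above is to check that every key used in one data set exists in the data set it refers to. The sketch below assumes two hypothetical in-memory tables, `sales` and `products`, joined on a hypothetical `product_id` column; it is one possible check, not a complete integration test suite.

```python
def find_orphan_keys(fact_rows, dimension_rows, key):
    """Return key values used in fact_rows that are missing from dimension_rows."""
    known = {row[key] for row in dimension_rows}
    return sorted({row[key] for row in fact_rows if row[key] not in known})


# Hypothetical sample data: product 9 is sold but never described.
sales = [
    {"product_id": 1, "units": 3},
    {"product_id": 2, "units": 1},
    {"product_id": 9, "units": 5},
]
products = [
    {"product_id": 1, "name": "Desk"},
    {"product_id": 2, "name": "Lamp"},
]

orphans = find_orphan_keys(sales, products, "product_id")
```

An empty result suggests the two sources connect cleanly on that key; any value returned points to records that would disappear or show as unknown in a joined visualization.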


Phase 4
Data Solution Deployment
Objective: To publish and maintain the data solution as a useful and accessible tool.
Key activities:
- Installation of programs in a production environment.
- Enrollment of users with security restrictions.
- Integration into dashboards, reports, or interactive platforms.
- Establishment of automatic update mechanisms.

Cycles
These phases are applied cyclically and incrementally as the project progresses. For example, deploying a visualization on a data warehouse involves the following cycles:
 | Cycle 1 Exploratory Analysis | Cycle 2 Data Architecture | Cycle 3 Effective Visualization |
---|---|---|---|
Phase 1 Data Requirements | The customer's needs and the sources of information to be used are listed. | The organization of the data is defined so that the data sources support the expected visualization. | The expected visualization sketches for each user profile are detailed. |
Phase 2 Data Design and Development | A data profile is applied to each source of information separately, looking for clues. | Tables, dimensions, formulas, and pipelines are built to transform the data. | Each visualization is created following visual design recommendations. |
Phase 3 Data Evaluation | The expectation is compared against the actual feasibility of the data. | Data quality and data warehouse performance are evaluated. | The quality of the visualization is evaluated together with the user for different scenarios. |
Phase 4 Data Solution Deployment | A quick report is delivered with the statistical findings, shortcomings, and improvements to be applied. | A data warehouse is installed and made available for advanced visualizations. | The visualizations are installed with security roles and a connection to production data. |
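
The data profile applied to each source in Cycle 1, Phase 2 can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names `profile_column` and `profile_source` and the chosen statistics (row count, missing values, distinct values, most frequent value) are assumptions made for this example, and a real profile would typically cover more.

```python
from collections import Counter


def profile_column(values):
    """Summarize one column: row count, missing values, distinct values, top value."""
    present = [v for v in values if v not in (None, "")]
    counts = Counter(present)
    most_common = counts.most_common(1)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "top_value": most_common[0][0] if most_common else None,
    }


def profile_source(rows):
    """Profile every column found in a list of dict records."""
    columns = sorted({name for row in rows for name in row})
    return {
        name: profile_column([row.get(name) for row in rows])
        for name in columns
    }
```

Running `profile_source` on each source separately gives a quick view of gaps and dominant values, which can feed the report delivered at the end of Cycle 1.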
By following this model, we hope that the client will gain traceability and transparency in software construction projects and feel greater confidence in delegating their tasks to a methodical and responsible work team.