An Outline Of The Data Pipeline
By T. McDonald | 2/07/19

The data pipeline is one way of handling data. It involves acquiring data from one or more sources, preparing it for use, analysing it, and presenting the findings to an appropriate audience. The pipeline therefore has four stages, which I will outline in this blog:

1. Acquisition
2. Preparation
3. Analysis
4. Presentation

Acquisition

Before you can do anything, you will need to find some data and determine whether it is suitable for the task. This includes the legal issues surrounding the data, such as its licensing: are you allowed to use it and, if so, what are you allowed to do with it? For example, there may be limitations on how the dataset can be used.

Furthermore, files come in different formats, such as CSV or JSON:

exampleFile.csv
or
exampleFile.json

Meaning of the extensions:

CSV = Comma-Separated Values
JSON = JavaScript Object Notation

These are just two examples; you will come across other formats too.
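To make the two formats above more concrete, here is a minimal sketch of how they might be loaded in Python using only the standard library. The helper name load_records and the sample contents (names and scores) are made up for illustration; only the file names exampleFile.csv and exampleFile.json come from the text above. The sketch also assumes the JSON file holds either a list of objects or a single object.

```python
import csv
import json
import tempfile
from pathlib import Path


def load_records(path):
    """Load a CSV or JSON file into a list of dictionaries, chosen by file extension."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        # newline="" is what the csv module documentation recommends when opening files
        with path.open(newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle))
    if suffix == ".json":
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Assumption: the file holds a list of objects, or one object we wrap in a list
        return data if isinstance(data, list) else [data]
    raise ValueError(f"Unsupported file format: {suffix}")


# Hypothetical sample data, written to a temporary directory so the sketch runs as-is
sample_dir = Path(tempfile.mkdtemp())
csv_path = sample_dir / "exampleFile.csv"
json_path = sample_dir / "exampleFile.json"
csv_path.write_text("name,score\nAda,90\nGrace,85\n", encoding="utf-8")
json_path.write_text(
    '[{"name": "Ada", "score": 90}, {"name": "Grace", "score": 85}]',
    encoding="utf-8",
)

csv_records = load_records(csv_path)
json_records = load_records(json_path)
print(csv_records[0])
print(json_records[0])
```

One thing worth noticing when you run this: the CSV reader returns every value as text, while JSON keeps numbers as numbers. That kind of difference is part of deciding whether a dataset is suitable for the task.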