Prepare the Data
The distributed analysis service of iServer supports the following input data sources. Once the data is ready, iServer automatically lists the datasets that meet the analysis criteria whenever you create a distributed analysis task.
Relational datasets stored in iServer DataStore
Datasets in big data file shares
Shared directory
HDFS (Hadoop Distributed File System) directory
Datasets stored in spatial databases
iServer DataStore is an application that lets you quickly create a data store and associate it with iServer. To build an iServer DataStore distributed environment, refer to Building an iServer DataStore Distributed Environment.
Relational datasets in iServer DataStore come from two sources:
Create Dataset: On the %datacatalog_uri%/relationalship/datasets resource page, you can create a dataset. A dataset created here contains no features; to add features, connect to iServer DataStore through iDesktop.
Import Dataset: On the %datacatalog_uri%/relationalship/dataimport resource page, you can import datasets. The data types supported for import are CSV, UDB, Workspace, Excel, and GeoJSON. After a successful import, the imported dataset is listed in the %datacatalog_uri%/relationalship/datasets resource.
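In the paths above, %datacatalog_uri% appears to stand for the root address of the data directory service. A minimal sketch of expanding the placeholder in Python follows. The root address used here (host, port, and path) is hypothetical and only illustrates the substitution; replace it with the address of your own deployment.

```python
# Placeholder used in the documentation for the data directory service root.
PLACEHOLDER = "%datacatalog_uri%"

# Resource paths exactly as they are written in the documentation.
RESOURCE_TEMPLATES = {
    "datasets": "%datacatalog_uri%/relationalship/datasets",
    "dataimport": "%datacatalog_uri%/relationalship/dataimport",
}


def expand(template, datacatalog_uri):
    """Replace the placeholder with a concrete root address."""
    # Drop a trailing slash so the result never contains "//" after the root.
    return template.replace(PLACEHOLDER, datacatalog_uri.rstrip("/"))


# Hypothetical root address, for illustration only.
EXAMPLE_ROOT = "http://localhost:8090/iserver/services/datacatalog"

for name, template in RESOURCE_TEMPLATES.items():
    print(name, expand(template, EXAMPLE_ROOT))
```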
iServer administrators can register CSV files, UDB files, and HDFS directories as big data file shares of iServer. For details, refer to the registration method for Big Data File Shares. Datasets in a successfully registered big data file share appear in the datasets resource of the data directory service and can serve as input data for the distributed analysis service.
CSV data files registered to iServer need to be validated before they can be used by the distributed analysis service. The verification method is:
If CSV data that is not registered to iServer is used for the distributed analysis service, make sure that the directory storing the CSV file also contains a corresponding .meta file holding the meta information for that CSV data file. Taking the sample data "newyork_taxi_2013-01_14k.csv" as an example, the contents of its .meta file are:
{
  "FieldInfos": [
    { "name": "col0", "type": "WTEXT" },
    { "name": "col1", "type": "WTEXT" },
    { "name": "col2", "type": "WTEXT" },
    { "name": "col3", "type": "INT32" },
    { "name": "col4", "type": "WTEXT" },
    { "name": "col5", "type": "WTEXT" },
    { "name": "col6", "type": "WTEXT" },
    { "name": "col7", "type": "INT32" },
    { "name": "col8", "type": "INT32" },
    { "name": "col9", "type": "DOUBLE" },
    { "name": "X", "type": "DOUBLE" },
    { "name": "Y", "type": "DOUBLE" },
    { "name": "col12", "type": "DOUBLE" },
    { "name": "col13", "type": "DOUBLE" }
  ],
  "GeometryType": "POINT",
  "HasHeader": false,
  "StorageType": "XYColumn"
}
iServer administrators can register Oracle, PostgreSQL, and PostGIS databases as spatial databases of iServer through the "Register DataStore" function on the "Data" and "Data Registration" pages. For details, refer to the Registration Method for Spatial Databases. Datasets in a successfully registered spatial database appear in the datasets resource of the data directory service and can serve as input data for the distributed analysis service.