Prepare the Data

The distributed analysis service of iServer supports the following input data sources. Once the data is ready, iServer automatically lists the datasets that meet the analysis requirements whenever you create a distributed analysis task.

iServer DataStore

iServer DataStore is an application that lets you quickly create a data store and associate it with iServer. To build an iServer DataStore distributed environment, refer to Building an iServer DataStore Distributed Environment.

There are two sources of relational data sets in the iServer DataStore:

Big Data File Share

iServer administrators can register CSV files, UDB files, and HDFS directories as iServer big data file shares. For details, refer to the registration method for Big Data File Shares. Datasets in a successfully registered big data file share appear in the datasets resource of the data directory service and can serve as input data for the distributed analysis service.

CSV data files registered to iServer must be verified before they can be used by the distributed analysis service. To verify them:

  1. Open the data registration page, http://localhost:8090/iserver/admin-ui/data/dataRegistration. If the file directory is registered, an orange exclamation mark in the "Status" column indicates that the file has not been verified yet.
  2. Click "Verify" in the action bar, or click the corresponding "Store ID" to open the dataset list and click "Verify" there.
  3. Specify the X and Y indexes in the "Data Meta Info" pop-up box.
  4. Click "OK". When the status changes to a green check mark, the file has been verified successfully.
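
Before filling in the X and Y indexes in step 3, it can help to see which column position holds which values. The following is a minimal sketch, not part of iServer: `preview_columns` is a hypothetical helper, the inline sample rows are made up for illustration, and the indexes it prints are zero-based, so confirm which convention the "Data Meta Info" dialog expects.

```python
import csv
import io


def preview_columns(csv_file, rows=3):
    """Return [(column_index, [sample values...]), ...] for the first rows.

    csv_file is any open text file object; column_index is zero-based.
    """
    reader = csv.reader(csv_file)
    # Read at most `rows` records without loading the whole file.
    sample = [row for _, row in zip(range(rows), reader)]
    if not sample:
        return []
    width = max(len(row) for row in sample)
    return [
        (i, [row[i] if i < len(row) else "" for row in sample])
        for i in range(width)
    ]


# Made-up sample rows, standing in for a real CSV opened with open(path, newline="").
demo = io.StringIO("a,1.5,2.5\nb,3.5,4.5\n")
for index, values in preview_columns(demo):
    print(index, values)
```

For a real file, pass `open(path, newline="", encoding="utf-8")` instead of the `io.StringIO` object and look for the two columns that contain coordinate values.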

If CSV data that is not registered to iServer is used by the distributed analysis service, ensure that the directory storing the CSV file also contains a corresponding .meta file holding the meta information for that CSV data file. Taking the sample data "newyork_taxi_2013-01_14k.csv" as an example, the contents of its .meta file are:

{
    "FieldInfos": [
        {
            "name": "col0",
            "type": "WTEXT"
        },
        {
            "name": "col1",
            "type": "WTEXT"
        },
        {
            "name": "col2",
            "type": "WTEXT"
        },
        {
            "name": "col3",
            "type": "INT32"
        },
        {
            "name": "col4",
            "type": "WTEXT"
        },
        {
            "name": "col5",
            "type": "WTEXT"
        },
        {
            "name": "col6",
            "type": "WTEXT"
        },
        {
            "name": "col7",
            "type": "INT32"
        },
        {
            "name": "col8",
            "type": "INT32"
        },
        {
            "name": "col9",
            "type": "DOUBLE"
        },
        {
            "name": "X",
            "type": "DOUBLE"
        },
        {
            "name": "Y",
            "type": "DOUBLE"
        },
        {
            "name": "col12",
            "type": "DOUBLE"
        },
        {
            "name": "col13",
            "type": "DOUBLE"
        }
    ],
    "GeometryType": "POINT",
    "HasHeader": false,
    "StorageType": "XYColumn"
}
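
Writing such a file by hand is error-prone for wide CSVs, so it can be generated instead. The following is a minimal sketch under stated assumptions, not an iServer tool: `build_meta` and `write_meta` are hypothetical helpers, the keys and type names are copied from the sample above, and the sketch assumes the .meta file sits next to the CSV and shares its base name (the text above only says the directory must contain a corresponding .meta file).

```python
import json
from pathlib import Path


def build_meta(field_types, x_index, y_index, has_header=False):
    """Return a dict that mirrors the .meta layout of the sample above.

    field_types: one type name per CSV column, in column order.
    x_index / y_index: zero-based positions of the coordinate columns.
    """
    fields = []
    for i, type_name in enumerate(field_types):
        if i == x_index:
            name = "X"
        elif i == y_index:
            name = "Y"
        else:
            name = f"col{i}"
        fields.append({"name": name, "type": type_name})
    return {
        "FieldInfos": fields,
        "GeometryType": "POINT",
        "HasHeader": has_header,
        "StorageType": "XYColumn",
    }


def write_meta(csv_path, meta):
    """Write meta as JSON beside the CSV and return the .meta path."""
    # Assumption: same directory and base name as the CSV, ".meta" suffix.
    meta_path = Path(csv_path).with_suffix(".meta")
    meta_path.write_text(json.dumps(meta, indent=4), encoding="utf-8")
    return meta_path


# Column types of the taxi sample: 14 columns, X and Y at positions 10 and 11.
TAXI_TYPES = (
    ["WTEXT"] * 3 + ["INT32"] + ["WTEXT"] * 3 + ["INT32"] * 2 + ["DOUBLE"] * 5
)
taxi_meta = build_meta(TAXI_TYPES, x_index=10, y_index=11)
```

Calling `write_meta("newyork_taxi_2013-01_14k.csv", taxi_meta)` would then produce `newyork_taxi_2013-01_14k.meta` in the same directory, with the same fields as the sample.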

Spatial Database

iServer administrators can register Oracle, PostgreSQL, and PostGIS databases as iServer spatial databases through the "Register DataStore" function on the "Data" > "Data Registration" page. For details, refer to the Registration Method for Spatial Databases. Datasets in a successfully registered spatial database appear in the datasets resource of the data directory service and can serve as input data for the distributed analysis service.