Use the -input_file_type option to tell mlcp the format of the data in each input file (or each entry inside a compressed file). This option controls whether and how mlcp converts the content into database documents.

The default input type is documents, which means each input file or ZIP file entry creates one database document. All other input file types represent composite input formats, which can yield multiple database documents per input file.

The following summary gives a quick reference of the supported input file types, along with the allowed document types for each, and whether or not they can be passed to mlcp as compressed files.

- documents: XML, JSON, text, or binary, controlled with -document_type.
- archive: As in the database: XML, JSON, text, and/or binary documents, plus metadata. Compressed input: No (archives are already in compressed format).
- sequencefile: XML, text, or binary, controlled with sequence-file-specific options. The contents can be compressed when you create the sequence file. Compression is bound up with the value class you use to generate and import the file.
- rdf: Serialized RDF triples, in one of several formats. For details, see Supported RDF Triple Formats in the Semantic Graph Developer's Guide. RDF/JSON is not supported.
- forest: As in the database: XML, JSON, text, and/or binary documents.

When the input file type is documents or sequencefile, you must consider both the input format (-input_file_type) and the output document format (-document_type). In addition, for some input formats, input can come from either compressed or uncompressed files (-input_compressed).

The -document_type option controls the database document format when -input_file_type is documents or sequencefile. MarkLogic Server supports text, JSON, XML, and binary documents. If the document type is not explicitly set with these input file types, mlcp uses the input file suffix to determine the type. For details, see How mlcp Determines Document Type.

You cannot use mlcp to perform document conversions. Your input data should match the stated document type. For example, you cannot convert XML input into a JSON document just by setting -document_type json.

The default database URI assigned to ingested documents depends on the input source. Loading content from the local filesystem can create different URIs than loading the same content from a ZIP file or archive. Command line options are available for you to modify this behavior and generate different URIs; for details, see Transforming the Default URI.

The following summarizes the default behavior with several input sources:

- Documents in a directory: /path/filename. Note that on Windows, the device (c:) becomes a path step, so c:\path\file becomes /c:/path/file.
- Documents in a compressed file: compressed-file-path/path/inside/zip/filename. If the input file is /space/data/big.zip and it contains a directory entry bill/, then the document URI for dream.xml in that directory is /space/data/big.zip/bill/dream.xml.
- A GZIP-compressed document: the path and file name without the compression suffix. If the input is /space/data/, the result is /space/data/big.xml.
- Delimited text: the value in the column used as the id (the first column, by default). For a record of the form first,second,third where Column 1 is the id, the URI is first.
- Archive or forest: the document URI from the source database.
- Aggregate XML: /path/filename-split_start-seqnum, where /path/filename is the full path to the input file, split_start is the byte position from the beginning of the split, and seqnum begins with 1 and increments for each document created.

For example, the following command loads all files from the file system directory /space/bill/data into the database attached to the App Server on port 8000. The documents inserted into the database have URIs of the form /space/bill/data/filename.

    # Windows users, see Modifying the Example Commands for Windows
    $ mlcp.sh import -host localhost -port 8000 -username user \
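
The default URI rules described above can be made concrete with a few small shell helpers. This is only an illustrative sketch: the function names are hypothetical and not part of mlcp, the hyphen-joined form of the aggregate URI is an assumption based on the description of its three parts, and the delimited-text helper splits naively on the first comma.

```shell
#!/bin/sh
# Hypothetical helpers that mimic the default URI rules described above.
# They are NOT part of mlcp; they only illustrate the string handling.

# File loaded from a directory: the URI is the full path.
# On Windows the device (c:) becomes a leading path step.
uri_for_file() {
  path=$(printf '%s' "$1" | tr '\\' '/')   # backslashes -> forward slashes
  case "$path" in
    /*) printf '%s\n' "$path" ;;
    *)  printf '/%s\n' "$path" ;;
  esac
}

# Entry inside a ZIP file: archive path followed by the path inside the ZIP.
uri_for_zip_entry() {
  printf '%s/%s\n' "$(uri_for_file "$1")" "${2#/}"
}

# Delimited text: the id column, assumed here to be the first column.
# Naive split on the first comma; quoted fields are not handled.
uri_for_delimited_record() {
  printf '%s\n' "${1%%,*}"
}

# Aggregate XML: path, split start, and sequence number
# (assumed to be joined with hyphens in that order).
uri_for_aggregate() {
  printf '%s-%s-%s\n' "$(uri_for_file "$1")" "$2" "$3"
}

uri_for_file 'c:\path\file'                               # prints /c:/path/file
uri_for_zip_entry '/space/data/big.zip' 'bill/dream.xml'  # prints /space/data/big.zip/bill/dream.xml
uri_for_delimited_record 'first,second,third'             # prints first
uri_for_aggregate '/space/data/big.xml' 0 1               # prints /space/data/big.xml-0-1
```

Working through the rules this way makes it easier to predict which URIs a load will produce before deciding whether the URI transformation options are needed.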