Is it possible to add SFTP operations to the library? I will soon be working on a project where an automated batch job needs to fetch files over SFTP.
We’re planning a new feature whereby, for all I/O operations, you will be able to choose the type of “filesystem” to use for a specific operation. The filesystems we are planning are FTP, SFTP, ZIP and possibly SCP. It will be a couple of weeks before we get to this, though.
+1 for being able to read files from within a zip.
Will other compression formats, such as .gz and .tar.gz, be supported? I don’t have a use case for non-zip access yet, but it will come in the future.
We now support the following formats:
Copy from/to:
- local drives
- sftp
- .gz files
Copy from only:
- http://
- .zip
- .tar (.tar.gz and .tgz)
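For anyone wondering what "copy from" an archive amounts to, here is a rough Python sketch using only the standard library. It is not ORQA's API; the function names `read_from_zip` and `read_from_tar_gz` and the member name `data.csv` are made up for illustration, and the archives are built in memory so the snippet runs on its own.

```python
import io
import tarfile
import zipfile


def read_from_zip(zip_bytes, member):
    """Return the bytes of one member stored in a .zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.read(member)


def read_from_tar_gz(tar_bytes, member):
    """Return the bytes of one regular file stored in a .tar.gz / .tgz archive."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:gz") as archive:
        extracted = archive.extractfile(member)
        if extracted is None:
            # extractfile() returns None for directories and other non-file members
            raise ValueError(f"{member} is not a regular file")
        return extracted.read()


# Build two tiny archives in memory so the sketch needs no files on disk.
payload = b"id,value\n1,42\n"

zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("data.csv", payload)

tar_buffer = io.BytesIO()
with tarfile.open(fileobj=tar_buffer, mode="w:gz") as archive:
    info = tarfile.TarInfo("data.csv")
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))

zip_content = read_from_zip(zip_buffer.getvalue(), "data.csv")
tar_content = read_from_tar_gz(tar_buffer.getvalue(), "data.csv")
```

Both calls hand back the original CSV bytes, which is the "read a file from within an archive" behaviour requested earlier in the thread.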
There are some interesting scenarios you could easily implement with ORQA:
Read a CSV file over HTTP, process it, and store it in a local XLS file
The steps in ORQA would be:
- Read CSV file from http://www.quandl.com/api/v3/datasets/YAHOO/INDEX_DJI.csv
- Filter the data to keep only rows from the last 10 years
- Save to a local XLS file
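The filter step above can be sketched in plain Python to show the logic involved. This is not ORQA code: the function name `filter_recent_rows` is hypothetical, and it assumes the CSV has a `Date` column in ISO format (YYYY-MM-DD), which may not match the real file. The download and the XLS export are left out, since writing XLS needs a third-party library.

```python
import csv
import io
from datetime import date


def filter_recent_rows(csv_text, years=10, today=None, date_column="Date"):
    """Keep only rows whose date falls within the last `years` years."""
    today = today or date.today()
    # Cutoff is the same calendar day `years` years ago (Feb 29 falls back to Feb 28).
    try:
        cutoff = today.replace(year=today.year - years)
    except ValueError:
        cutoff = today.replace(year=today.year - years, day=28)
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if date.fromisoformat(row[date_column]) >= cutoff]


# Made-up sample data standing in for the downloaded file.
sample = "Date,Close\n2001-03-15,9800.5\n2015-06-01,18040.4\n2016-01-04,17148.9\n"
recent = filter_recent_rows(sample, years=10, today=date(2016, 6, 1))
```

Passing `today` explicitly keeps the result reproducible; in a real batch you would leave it out and let it default to the current date.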
Read a CSV file from an SFTP location, apply a transformation, and upload the result as XLS to another SFTP location
The steps in ORQA would be:
- Create a gzip reference: sftp://host/file.csv.gz, name = gzipInput
- Create a gzip reference: sftp://host/file-new.xls.gz, name = gzipOutput
- Read the CSV file from =gzipInput
- Transform the file (e.g. filter for certain values)
- Write the XLS file to =gzipOutput
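As a rough illustration of the decompress, transform, recompress part of this scenario, here is a standard-library Python sketch. It is not how ORQA implements it: the function name `transform_gzipped_csv` and the sample columns are invented, the SFTP transfer is omitted, and the output stays CSV rather than XLS because both of those would need third-party libraries.

```python
import csv
import gzip
import io


def transform_gzipped_csv(gz_bytes, keep_row):
    """Decompress CSV bytes, keep rows matching `keep_row`, and recompress."""
    text = gzip.decompress(gz_bytes).decode("utf-8")
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if keep_row(row):
            writer.writerow(row)
    return gzip.compress(out.getvalue().encode("utf-8"))


# Made-up input standing in for the contents of file.csv.gz.
source = gzip.compress(b"symbol,price\nAAA,10\nBBB,250\nCCC,75\n")
result = transform_gzipped_csv(source, lambda row: float(row["price"]) >= 50)
result_text = gzip.decompress(result).decode("utf-8")
```

The filter is passed in as a function, so the same helper covers any "filter for certain values" rule without touching the compression handling.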