Read remote file beginning with "smb://" using R

Joshua Rosenberg · Feb 6, 2017 · Viewed 12.4k times

To read a file in R, I'd normally do something like the following:

read.csv('/Users/myusername/myfilename.csv')

But I'm trying to read a file located on a remote server (a Windows SMB/CIFS share), which I access on my Mac via the Finder > Go > Connect to Server menu item.

When I view that file's properties, the file path is different than what I'm used to. Instead of beginning with: /Users/myusername/..., it is smb://server.msu.edu/.../myfilename.csv.

To read the file, I tried the following:

read.csv('smb://server.msu.edu/.../myfilename.csv')

But, this didn't work.

Instead of the usual "No such file or directory" error, this returned:

smb://server.msu.edu/.../myfilename.csv does not exist in current working directory

I imagine the file path needs a different format, but I can't figure out what.

How can you read this type of file in R?

Answer

stacksonstacks · Dec 6, 2017

Explanation

smb://educ-srvmedia1.campusad.msu.edu/... is actually a URL, not a file path.

Let's break this down:

smb:// means use the Server Message Block (SMB) protocol (file sharing)

educ-srvmedia1.campusad.msu.edu is the name of the server

/.../myfilename.csv is the file share/path on the remote server
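The same breakdown can be reproduced in R with a regular expression. This is only an illustrative sketch: split_url is a hypothetical helper (not a base R function), and the elided "..." in the path is kept exactly as it appears above.

```r
# Hypothetical helper: split a URL into scheme, host, and path.
split_url <- function(u) {
  pattern <- "^([A-Za-z][A-Za-z0-9+.-]*)://([^/]+)(/.*)?$"
  m <- regmatches(u, regexec(pattern, u))[[1]]
  if (length(m) == 0) stop("not a URL: ", u)
  # m[1] is the full match; m[2..4] are the three capture groups
  list(scheme = m[2], host = m[3], path = m[4])
}

parts <- split_url("smb://educ-srvmedia1.campusad.msu.edu/.../myfilename.csv")
parts$scheme  # "smb"
parts$host    # "educ-srvmedia1.campusad.msu.edu"
parts$path    # "/.../myfilename.csv"
```

Only the path component is something a local file function could use, and only once the share is available on the local filesystem.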

You are able to navigate to this directory using Finder on OSX because it has built-in support for the SMB protocol. Finder connects to the remote service using the URL and allows you to browse the files.

However, R has no understanding of the SMB protocol, so it can't interpret the file path properly.

The R function read.csv() uses file() internally; see https://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html

url() and file() support the URL schemes file://, http://, https:// and ftp://

So R reports that the file cannot be found, when the real problem is that the protocol is unsupported. Yes, this is slightly confusing.
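One way to get a clearer signal than a "file not found" message is to check the scheme before handing the path to read.csv(). The following is a minimal sketch, assuming the scheme list quoted above; is_readable_location and supported_schemes are hypothetical names introduced for illustration.

```r
# Schemes named above as supported by url() and file().
supported_schemes <- c("file", "http", "https", "ftp")

# Hypothetical helper: TRUE for a plain local path (no scheme) or for a
# URL whose scheme is in the supported list, FALSE otherwise.
is_readable_location <- function(path) {
  has_scheme <- grepl("^[A-Za-z][A-Za-z0-9+.-]*://", path)
  if (!has_scheme) return(TRUE)
  scheme <- tolower(sub("://.*$", "", path))
  scheme %in% supported_schemes
}

is_readable_location("/Users/myusername/myfilename.csv")         # TRUE
is_readable_location("smb://server.msu.edu/.../myfilename.csv")  # FALSE
```

A FALSE result here says nothing about whether the file exists; it only indicates that R would not know how to open the location directly.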

Fix

You need to mount the file share on your local filesystem.

All this means is that the OS handles the details of the SMB protocol behind the scenes and presents the file share as a local directory.

This will allow R (and other programs) to treat the remote files, for all intents and purposes, like any other local files. This discussion shows some options for doing so.

e.g.

# /LocalFolder must already exist (e.g. mkdir /LocalFolder)
# on macOS the filesystem type is smbfs; -t cifs applies to Linux
mount -t smbfs //username:password@hostname/sharename /LocalFolder

then in R:

read.csv('/LocalFolder/myfilename.csv')
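If the share is not mounted when the script runs, read.csv() fails with a generic file error. A small wrapper can make that failure easier to diagnose. This is a sketch under stated assumptions: read_from_mount is a hypothetical helper, and /LocalFolder is the mount point from the example above.

```r
# Hypothetical helper: read a CSV from a mounted share, failing early
# with a clear message if the mount point or the file is missing.
read_from_mount <- function(mount_dir, filename, ...) {
  if (!dir.exists(mount_dir)) {
    stop("mount point not found (is the share mounted?): ", mount_dir)
  }
  path <- file.path(mount_dir, filename)
  if (!file.exists(path)) {
    stop("file not found on share: ", path)
  }
  read.csv(path, ...)
}

# Usage, assuming the share is mounted at /LocalFolder:
# df <- read_from_mount("/LocalFolder", "myfilename.csv")
```

Note that an empty, unmounted mount point still passes the dir.exists() check, so the file.exists() check is what catches that case.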

Extra

Windows users can accomplish this more easily with UNC paths:
How to read files from a UNC-specified directory in R?