java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters when using national characters

Arturas M picture Arturas M · Aug 27, 2016 · Viewed 17.2k times · Source

I'm trying to create some directories which have national symbols like "äöü" etc. Unfortunately I'm getting this exception whenever that is being attempted:

java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: /home/pi/myFolder/löwen
        at sun.nio.fs.UnixPath.encode(UnixPath.java:147)
        at sun.nio.fs.UnixPath.<init>(UnixPath.java:71)
        at sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:281)
        at java.nio.file.Paths.get(Paths.java:84)
        at org.someone.something.file.PathManager.createPathIfNecessary(PathManager.java:161)
...
        at java.lang.Thread.run(Thread.java:744)

My code where it occurs looks like this:

public static void createPathIfNecessary(String directoryPath) throws IOException {
        Path path = Paths.get(directoryPath);
        // if directory exists?
        if (!Files.exists(path)) {
            Files.createDirectories(path);
        } else if (!Files.isDirectory(path)) {
            throw new IOException("The path " + path + " is not a directory as expected!");
        }
    }

I searched for possible solutions and most suggest to set the locale to UTF-8, so I thought I would get this fixed if I set the locale in Linux to UTF-8, but I found out that it has already been UTF-8 all the time, and despite newly setting it, I'm still having the same problem.

 $ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

I'm not having this problem on Windows 7, it creates the directories perfectly, so I'm wondering whether I need to improve the java code to handle this situation better, or to change something in my Linux.

The Linux I'm running it on is a Raspbian on a Raspberry Pi 2:

$ cat /etc/*-release

    PRETTY_NAME="Raspbian GNU/Linux 7 (wheezy)"
    NAME="Raspbian GNU/Linux"
    VERSION_ID="7"
    VERSION="7 (wheezy)"
    ID=raspbian
    ID_LIKE=debian
    ANSI_COLOR="1;31"
    HOME_URL="http://www.raspbian.org/"
    SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
    BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"

I am running my application on a Tomcat 7 Server (Java version is 1.8 I believe), my setenv.sh starts with: export JAVA_OPTS="-Dfile.encoding=UTF-8 ...

Does anybody have a solution to this problem? I need to be able to use those national symbols in directory/file names...

EDIT:

After adding the extra option Dsun.jnu.encoding=UTF-8 at the start of my setenv.sh for Tomcat and restarting something changed.

Currently the my start of setenv.sh looks like this

export JAVA_OPTS="-Dsun.jnu.encoding=UTF-8 -Dfile.encoding=UTF-8 

it seems like this exception is gone and the folder with the national symbols gets created, however the problem seems to not be solved completely, whenever I try to create/write to files within that directory, I now get:

java.io.FileNotFoundException: /home/pi/myFolder/löwen/Lowen.tmp (No such file or directory)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:206)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:156)
        at org.someone.something.MyFileWriter.downloadFiles(MyFileWriter.java:364)
        ...
        at java.lang.Thread.run(Thread.java:744)

The code where it happens looks like this:

// output here
File myOutputFile = new File(filePath);
FileOutputStream out = (new FileOutputStream(myOutputFile));
out.write(bytes);
out.close();

It seems to fail on (new FileOutputStream(myOutputFile)); when it's trying to initialize the FileOutputStream with the File object, which has the path created from a string which was retrieved from the path in the exception above and an added filename at the end.

So now the directory is created, however writing or creating anything inside it still results in the exception above, although the file inside it doesn't event contain national symbols.

Creating paths and files in them when they have no national symbols works as perfectly as it did before the change in setenv.sh, so it looks like the problem is connected to the national symbols within the path still...

Answer

Krzysztof Kaszkowiak picture Krzysztof Kaszkowiak · Aug 27, 2016

If the national characters are hardcoded in your source, convert the source file to the same encoding. You can use vim:

vim SourceClassWithHardcodedCharacters.java
:set fileencoding=utf-8<Enter>
:w<Enter>

If there is an issue, you will get a message ("unmappable character (...)").

For me, the issue is related either with 1. hardcoding characters in incorrect encoding or 2. losing the encoding somehow during passing the path to the method.