Showing posts with label io. Show all posts

Thursday, December 13, 2012

File.setLastModified & File.lastModified

I have observed some interesting behavior of File.lastModified() on Linux. My problem was that I was incrementing the file's modification time by 1 in one thread and monitoring the change from another thread. Apparently the value never changed: the other thread did not see the increment. After some time trying to make it work, I realized that I have to increment it by at least 1000 to make the change visible.

Wondering why that happens, I had a look at the JDK source code, and this is what I found:

JNIEXPORT jlong JNICALL
Java_java_io_UnixFileSystem_getLastModifiedTime(JNIEnv *env, jobject this,
                                                jobject file)
{
    jlong rv = 0;

    WITH_FIELD_PLATFORM_STRING(env, file, ids.path, path) {
        struct stat64 sb;
        if (stat64(path, &sb) == 0) {
            rv = 1000 * (jlong)sb.st_mtime;
        }
    } END_PLATFORM_STRING(env, path);
    return rv;
}

What happens is that on Linux File.lastModified() has one-second resolution and simply drops the milliseconds: st_mtime is in whole seconds and is just multiplied by 1000. I'm not an expert in Linux programming, so I'm not sure whether there is a way to get that time with millisecond resolution on Linux. I assume it should be possible, because 'setLastModified' seems to work as expected and sets the modification time with millisecond resolution (you can find the source code in 'UnixFileSystem_md.c').

So, just a nice thing to remember: when you work with files on Linux, you may not see a change in File.lastModified() when its value is updated by less than 1000 ms.
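A minimal sketch of how one might work around this, assuming the one-second truncation shown in the native code above. The class and method names (MtimeSketch, truncateToSeconds, visibleAtSecondResolution, bump, readViaNio) are made up for illustration. The NIO.2 read path (Files.getLastModifiedTime, Java 7+) may report finer resolution, but that depends on the JDK version and the file system, so treat it as something to try rather than a guarantee.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

class MtimeSketch {
    /** Drops the millisecond part, mirroring "1000 * st_mtime" in the native code. */
    static long truncateToSeconds(long millis) {
        return Math.floorDiv(millis, 1000L) * 1000L;
    }

    /** True if a reader limited to one-second resolution would see the two values as different. */
    static boolean visibleAtSecondResolution(long oldMillis, long newMillis) {
        return truncateToSeconds(oldMillis) != truncateToSeconds(newMillis);
    }

    /** Moves the modification time forward to the next whole second, so the change is visible. */
    static boolean bump(File file) {
        long next = truncateToSeconds(file.lastModified()) + 1000L;
        return file.setLastModified(next);
    }

    /** Alternative read path; resolution depends on the JDK and the file system (assumption). */
    static long readViaNio(File file) throws IOException {
        return Files.getLastModifiedTime(file.toPath()).toMillis();
    }
}
```

The writer thread would call bump(file) instead of adding 1, and the monitoring thread can keep comparing File.lastModified() values as before.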

Wednesday, November 9, 2011

Java file flushing performance

There are many situations where you need to ensure that data has been written to disk, and the write also needs to be fast. The most common cases are databases, journaling, etc. Often it is also necessary to update some random position in a file. I specifically want to place emphasis on random access here, because the rest of this post covers only approaches that support it, i.e. I'm not going to mention OutputStream.flush() and related topics. I just haven't tried them, since that wasn't my case at the time.

There are several ways of flushing data to disk in Java. These options can be quite different in the way they are implemented internally and in their performance. Here is the list of things you can do:
  • FileChannel.force()
  • 'rws' or 'rwd' mode of RandomAccessFile, which 'works much like the force(boolean) method of the FileChannel class' (from javadoc).
  • MappedByteBuffer.force()
  • RandomAccessFile.getFD().sync()
  • any close() method. Here I mean seeking and closing the stream each time access is required. In my tests I didn't actually seek, since I was updating data at offset zero.
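To make the list concrete, here is a rough sketch of what the first four options might look like when writing 8 bytes at offset zero. The class and method names (FlushSketch, viaChannelForce and so on) are invented for illustration, and this is not the code used for the measurements in this post. Each helper opens and closes the file for the sake of being self-contained; a real test would keep the file open so that the cost of close() does not get mixed in.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

class FlushSketch {
    static byte[] toBytes(long value) {
        return ByteBuffer.allocate(8).putLong(value).array();
    }

    /** FileChannel.force(false): flush file content, not necessarily metadata. */
    static void viaChannelForce(File f, long value) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw");
             FileChannel ch = raf.getChannel()) {
            ByteBuffer buf = ByteBuffer.wrap(toBytes(value));
            while (buf.hasRemaining()) {
                ch.write(buf, buf.position()); // positional write, starting at offset 0
            }
            ch.force(false);
        }
    }

    /** "rwd" mode: every write is synchronously pushed to the storage device. */
    static void viaRwdMode(File f, long value) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rwd")) {
            raf.seek(0);
            raf.write(toBytes(value)); // one write call, so one synchronous update
        }
    }

    /** MappedByteBuffer.force(): flush the mapped region. */
    static void viaMappedBuffer(File f, long value) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw");
             FileChannel ch = raf.getChannel()) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, 8);
            map.putLong(0, value);
            map.force();
        }
    }

    /** RandomAccessFile.getFD().sync(): flush via the file descriptor. */
    static void viaFdSync(File f, long value) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw")) {
            raf.seek(0);
            raf.write(toBytes(value));
            raf.getFD().sync();
        }
    }

    /** Reads the 8 bytes at offset zero back, to check that a write went through. */
    static long readBack(File f) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(f, "r")) {
            return raf.readLong();
        }
    }
}
```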
Surprisingly (the only unsurprising exception is close()), all these methods give very different performance, and it varies almost randomly across OSes and file systems. It is worth noting that hardware can also affect the performance of any of these methods. I also have a strong feeling that performance may vary even with a minor change in OS or JVM version. Here is a table with the time it takes to flush 8 bytes (keep in mind that the real amount of flushed data depends on the size of caches and is going to be much more than 8 bytes), just to give a flavour of how different the results are:

          RandomAccessFile.getFD().sync()   RandomAccessFile, rwd mode   MappedByteBuffer.force()   FileChannel.force()
Windows   0.2818 ms                         0.0125 ms                    0.007 ms                   0.139 ms
Linux     0.5354 ms                         0.5144 ms                    0.4663 ms                  0.0093 ms

Please do not treat these numbers as a meaningful benchmark. They are here just to give an example of how much these things can vary.

So, what is the conclusion? If you need to write a high-performance application that does lots of IO, you really need to test different approaches on different OSes, on different file systems and, preferably, on different JVMs. Do not expect something to be fast on a Linux (Solaris, AIX, etc.) production box just because it is fast on your Windows (Linux, etc.) workstation, and vice versa. As can be seen, the difference can be orders of magnitude.
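For anyone who wants to run such a comparison on their own boxes, a very rough timing harness might look like the sketch below. FlushTiming, Flusher and averageMillis are hypothetical names, the warm-up and iteration counts are arbitrary, and this is a naive loop rather than a proper benchmark, so the numbers it prints are only good for spotting large differences.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

class FlushTiming {
    /** One write-and-flush strategy under test. */
    interface Flusher {
        void writeAndFlush(long value) throws IOException;
    }

    /** Average wall-clock time per call in milliseconds, after a short warm-up. */
    static double averageMillis(Flusher flusher, int warmup, int iterations) throws IOException {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            flusher.writeAndFlush(i);
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            flusher.writeAndFlush(i);
        }
        return (System.nanoTime() - start) / 1e6 / iterations;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("flush-timing", ".bin");
        f.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw");
             FileChannel ch = raf.getChannel()) {
            ByteBuffer buf = ByteBuffer.allocate(8);

            // Strategy 1: positional channel write followed by FileChannel.force().
            Flusher channelForce = v -> {
                buf.clear();
                buf.putLong(v);
                buf.flip();
                ch.write(buf, 0);
                ch.force(false);
            };
            // Strategy 2: RandomAccessFile write followed by getFD().sync().
            Flusher fdSync = v -> {
                raf.seek(0);
                raf.write(ByteBuffer.allocate(8).putLong(v).array());
                raf.getFD().sync();
            };

            System.out.printf("FileChannel.force():             %.4f ms%n",
                    averageMillis(channelForce, 100, 1000));
            System.out.printf("RandomAccessFile.getFD().sync(): %.4f ms%n",
                    averageMillis(fdSync, 100, 1000));
        }
    }
}
```

Running the same class on each target OS, file system and JVM is the whole point; a single run on a workstation says little about the production box.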