readLine terminates earlier for Windows 7 than XP SP2 or 2000 SP4

6 replies [Last post]
tdanecito
Offline
Joined: 2005-10-10
Points: 0

Hi All,

I have an app that reads a file via a URL connection. I noticed recently that on Windows XP and 2000 it reads the whole file, but on Windows 7 it terminates early without finishing the read. It used to work on Windows 7 32-bit with 32-bit Java.

I am using JRE 1.6.0_18.

Any clues?
Thanks,
-Tony

tdanecito
Offline
Joined: 2005-10-10
Points: 0

Clearing the browser cache fixed the problem. Strange, because I did call the URLConnection method that turns off caching (setUseCaches(false)). It was kind of a surprise to me that this could happen.

Thanks,
-Tony

tdanecito
Offline
Joined: 2005-10-10
Points: 0

I am beginning to answer my own questions from my last post.

Okay, according to Wikipedia, UTF-8 (a Unicode encoding) is very different from Cp1252. So even though changing the InputStreamReader to UTF-8 did not fix things, I will set it back to that.
I also found out that line.separator is 0D 0A hex (\r\n) on both versions of Windows. As far as I can tell, that value will not work for every Unicode document, so a developer needs control over what readLine treats as the end of a line. From what I can find, Java does not offer that. If that is true, it is a very big issue for documents.

Does anyone know how to dynamically set what readLine uses to determine the end of a line?

Thanks,
-Tony
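
On the readLine question: as far as the documented behavior goes, BufferedReader.readLine() does not consult line.separator at all. It ends a line at "\n", "\r", or "\r\n", so there is nothing to set. The sketch below only illustrates that on an in-memory string; the class and method names are made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class LineEndingDemo {
    // Splits text with BufferedReader.readLine(), which treats "\n", "\r",
    // and "\r\n" as line terminators independent of the line.separator property.
    static List<String> splitLines(String text) throws IOException {
        List<String> lines = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(new StringReader(text));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } finally {
            reader.close();
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Windows (\r\n), Unix (\n), and bare \r endings mixed in one input.
        System.out.println(splitLines("one\r\ntwo\nthree\rfour"));
        // prints [one, two, three, four]
    }
}
```

line.separator matters when writing (println, BufferedWriter.newLine()), not when reading with readLine.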

tdanecito
Offline
Joined: 2005-10-10
Points: 0

All righty then. It seems a large block of the HTML file being streamed is skipped under Windows 7 but picked up under Windows XP.
No exceptions are thrown, so I am not sure why that would happen.

I did make sure the content type and encoding are what is expected.

-Tony

tdanecito
Offline
Joined: 2005-10-10
Points: 0

Okay, more info. This seems to be getting more important by the hour.

It looks as if the Java default settings, even though the APIs document them as defaults, may not be working. I took the example given in this thread and ran it in Eclipse under Windows 7 32-bit using the 32-bit JRE 1.6.0_18, and got the same problem where part of the HTML read in was skipped. In the Eclipse project resource settings you can override the container settings for the newline delimiter (for readLine) and for the text encoding used by, for example, the InputStreamReader class. I was able to fix the issue by toggling the encoding from Cp1252 to UTF-8 and the newline setting from the container default to Windows. When I set both back to the original values, the failure did not come back. In fact, every combination of the two settings I tried worked once the failure had gone away.

So now I need to know how to reliably set:

1. The encoding type at runtime. The URLConnection getContentType method seems reliable: it says the HTML encoding is UTF-8, and separate tools also report the HTML content type as UTF-8.

2. The InputStreamReader seems to let you set the input character encoding and converts to Unicode. The default input encoding seems to be Cp1252, but the stream is UTF-8 as mentioned, so I am not sure whether it should be set to UTF-8 or not. In Eclipse this did not seem to matter, but I am not sure whether Eclipse was setting it properly or whether the JRE could be set reliably.

3. Finally, the BufferedReader seems to use the JRE value for readLine but does not seem to have a way to set it at runtime for Unix or Windows. I almost got the impression that it can only be set via the command line at JRE startup. Since I use Web Start, I am not sure I can control that except through the JNLP JRE command-line settings.

So the questions are:
1. Should Cp1252 be used for a UTF-8 input file (HTML)?
2. Does the JRE take all newline types into account? The file read in may not always use the Windows \r\n; it may use \n (the Linux newline) on a Windows system.
3. Is there a reliable way to determine the content type of an input file using Java? Would that be URLConnection in this case?

Thanks for your thoughts,
-Tony
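
A minimal sketch bearing on questions 1 and 3, assuming the server sends a Content-Type header like "text/html; charset=utf-8". The charsetOf helper is hypothetical (it is not part of the JDK); it just pulls the charset parameter out of the string that URLConnection.getContentType() returns, so that the result can be passed to the InputStreamReader constructor instead of relying on the platform default. The main method shows why the choice matters: the same UTF-8 bytes decode to different text under windows-1252.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class CharsetFromContentType {
    // Hypothetical helper: extracts the charset parameter from a Content-Type
    // value; returns the fallback if it is missing, malformed, or unsupported.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    return fallback; // illegal or unsupported charset name
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        byte[] utf8Bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        Charset declared = charsetOf("text/html; charset=utf-8", StandardCharsets.ISO_8859_1);
        String right = new String(utf8Bytes, declared);
        String wrong = new String(utf8Bytes, Charset.forName("windows-1252"));

        // The two-byte UTF-8 sequence for the accented letter becomes one char
        // when decoded as UTF-8 and two chars when decoded as windows-1252.
        System.out.println(right.length()); // prints 4
        System.out.println(wrong.length()); // prints 5
    }
}
```

With a real connection, the charset found this way would go into new InputStreamReader(urlConnection.getInputStream(), charset). Decoding with the wrong charset garbles non-ASCII characters, but on its own it would not be expected to drop a large block of the page.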

walterln
Offline
Joined: 2007-04-17
Points: 0

You have a bug in your code? Hard to tell without seeing it :).

tdanecito
Offline
Joined: 2005-10-10
Points: 0

Nope, I wish it was that simple. Here is a simple sample. The URL is a YouTube URL for a video. The behavior seems to vary: a friend ran it and it worked on his Windows 7 64-bit machine with a 32-bit JVM. I used the 32-bit JRE 1.6.0_18.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public void readFile(String url) {
    StringBuffer buf = new StringBuffer();

    // get the page from the url
    try {
        //System.out.println("HttpUrlDerive.url:" + url);
        // assumes the address to fetch is the url parameter
        URL pageUrl = new URL(url.trim());
        URLConnection urlConnection = pageUrl.openConnection();
        urlConnection.setReadTimeout(2000);    // milliseconds
        urlConnection.setConnectTimeout(5000); // milliseconds
        urlConnection.setUseCaches(false);
        // no charset is given, so the platform default encoding is used
        BufferedReader d = new BufferedReader(new InputStreamReader(urlConnection.getInputStream()));

        // read one character at a time until end of stream
        int data = d.read();
        while (data != -1) {
            char theChar = (char) data;
            buf.append(theChar);
            data = d.read();
        }
        d.close();

    } catch (Exception ex) {
        System.out.println("HttpUrlDerive.readFile() ex:" + ex);
    }

    System.out.println("HttpUrlDerive.readFile() buf.length:" + buf.length());

}
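
For comparison, a hedged sketch of the same read loop with the charset passed in explicitly and the stream read in chunks. It is not a diagnosis of the skipped block; it only removes the dependence on the platform default encoding. The readAll name is made up, and the demo uses an in-memory stream in place of urlConnection.getInputStream().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReadAllDemo {
    // Reads the stream to the end, decoding with the given charset rather
    // than the platform default. IOExceptions propagate to the caller.
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder buf = new StringBuilder();
        Reader reader = new BufferedReader(new InputStreamReader(in, charset));
        try {
            char[] chunk = new char[4096];
            int n;
            while ((n = reader.read(chunk)) != -1) {
                buf.append(chunk, 0, n);
            }
        } finally {
            reader.close();
        }
        return buf.toString();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes a server would send for a UTF-8 page.
        byte[] page = "<html>caf\u00e9</html>".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(page), StandardCharsets.UTF_8);
        System.out.println(text.length()); // prints 17
    }
}
```

Because readAll lets exceptions propagate, a read that stalls past the 2000 ms read timeout would surface as a SocketTimeoutException in the caller instead of leaving a silently shorter buffer.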