TODO (for Gecko 1.9)
- Introduce nsBaseChannel as a base class that other channels can subclass. This will help us consolidate duplicated code that is currently spread across each channel implementation and avoid inconsistencies in the process. It should also help reduce code size a bit. See bug 312760.
- Support IRIs better. This includes supporting %-escape sequences in hostnames. It also includes encoding non-ASCII URL parts to UTF-8, or more likely doing what other browsers do (which entails leaving the query parameters encoded in the origin charset). See bug 309671.
- Make it easier to use Necko from JS. Make use of nsIClassInfo to automatically reflect interfaces. Make it easier to work with streams from JS. nsIScriptableInputStream needs to be replaced with something better, for example.
- Improve support for synchronous HTTP requests. This likely translates to an overhaul of the event queue system.
- Given that HTTP is largely a stream pump on top of a pipe, shouldn't this be much easier to do, by just exposing the input end more directly?
- Yes, we could do it that way. However, there are issues to figure out such as whether or not blocking on a synchronous stream should block the UI from processing events.
- Support Unicode file paths. This impacts the way we encode file URLs. Currently, we encode them in the "native filesystem charset" and then %-escape any non-ASCII bytes. This is not going to work if we wish to support full Unicode file paths because the conversion from Unicode to "native filesystem charset" may be lossy. See bug 162361 and bug 278161.
- Optimize nsIURI construction (nsStandardURL::SetSpec). We may be able to shave off some cycles here by ensuring that the input string is assigned directly to mSpec when it is determined that no canonicalization is required. This will in many cases allow us to simply share the given string buffer instead of having to copy it.
- Improve disk cache: increase its default size, eliminate eviction-on-hash-collision (?), delay writing to disk when data is less than 16k, account for actual disk usage better.
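As an illustration of the file URL item above, here is a minimal sketch of one lossless alternative: keep the path in UTF-8 and %-escape its bytes directly, skipping the native filesystem charset. The choice of UTF-8 is an assumption made for this sketch, not a decision recorded in this list, and EscapeUtf8Path and MakeFileUrl are hypothetical helpers, not Necko APIs.

```cpp
#include <string>

// Hypothetical helper (not a Necko API): %-escape every byte of a UTF-8
// encoded path except RFC 3986 unreserved characters and '/'.
std::string EscapeUtf8Path(const std::string& utf8Path) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(utf8Path.size());
    for (unsigned char c : utf8Path) {
        const bool keep = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                          (c >= '0' && c <= '9') || c == '-' || c == '.' ||
                          c == '_' || c == '~' || c == '/';
        if (keep) {
            out += static_cast<char>(c);
        } else {
            // Each escaped byte becomes '%' followed by two hex digits.
            out += '%';
            out += kHex[c >> 4];
            out += kHex[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: build a file URL from an absolute UTF-8 path.
// Assumes the path starts with '/' (Unix-style); Windows drive paths
// would need extra handling that this sketch leaves out.
std::string MakeFileUrl(const std::string& utf8AbsolutePath) {
    return "file://" + EscapeUtf8Path(utf8AbsolutePath);
}
```

Because every UTF-8 byte is preserved, unescaping the URL recovers the original path exactly, which is not guaranteed when a lossy native charset sits in between.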
Comments from Alfred (2005/Oct/15)
- Increase the supported data size. Files bigger than 64M are accounted for incorrectly, which confuses eviction. In nsDiskCacheRecord, the maximum size for metadata can easily be reduced (metadata currently maxes out at 16K), providing more bits for the maximum data size.
- Image 'caching': because 'active' images are never evicted, the total size of the image cache can be much bigger than the 'hard limit'. Maybe find a way to forcefully evict the decoded image data from the cache while the original encoded format is (generally) still available?
- Merge nsDiskCacheDevice and nsDiskCacheMap.
- In nsDiskCacheEvictor::VisitRecord, avoid the mallocs for reading the complete 'DiskCacheEntry' and for copying the key just to check the clientID; instead, make a custom version of ReadDiskCacheEntry that only checks the clientID in the entry stored on disk.
- Implement asynchronous or delayed writing of cache files (for BlockFiles as well as normal files) so as not to block page loading because of caching (bug 197431).
- Disk cache size accounting: make the 'total count' the total size of all the disk files (the map, blockfiles and the actual data files), to enable really maximizing the total disk cache size.
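The 64M limit and the "more bits" remark in the first item of this list come down to bit-field arithmetic. The sketch below assumes, purely for illustration, that a size is stored as a fixed-width count of 1K units; the 16-bit width and the 1K unit are assumptions, not values taken from nsDiskCacheRecord, and MaxBytes and BitsNeeded are hypothetical helpers.

```cpp
#include <cstdint>

// Largest byte count that a size field of `bits` bits can describe when
// each unit stands for `unitBytes` bytes. Assumes bits < 64.
uint64_t MaxBytes(unsigned bits, uint64_t unitBytes) {
    return ((uint64_t(1) << bits) - 1) * unitBytes;
}

// Smallest number of bits able to hold every value in 0..maxValue.
unsigned BitsNeeded(uint64_t maxValue) {
    unsigned bits = 0;
    while (maxValue != 0) {
        ++bits;
        maxValue >>= 1;
    }
    return bits;
}
```

Under these assumptions, MaxBytes(16, 1024) is just under 64M, while a metadata size capped at 16K needs only BitsNeeded(16), that is 5, bits in 1K units. Any bits freed that way could in principle widen the data size field, with each extra bit doubling the representable size.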
Comments from Andreas (2005/Oct/18)
- Move the URLParser to RFC 3986 compliance, getting rid of the param component in the process (params can be part of any path segment, not just the last one).
- Better URI fixup code (for example, the handling of \ versus /).
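For reference, RFC 3986 Appendix B gives a regular expression that splits a URI reference into scheme, authority, path, query and fragment, with no separate param component. The following is a minimal sketch using that expression with std::regex; UriParts and SplitUri are hypothetical names for this example and say nothing about how the URLParser is or should be implemented.

```cpp
#include <regex>
#include <string>

// Hypothetical container for the five RFC 3986 components.
struct UriParts {
    std::string scheme, authority, path, query, fragment;
};

// Split a URI reference with the regular expression from RFC 3986,
// Appendix B. Returns false only if the regex fails to match.
bool SplitUri(const std::string& uri, UriParts& out) {
    static const std::regex kUriRe(
        R"(^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?)");
    std::smatch m;
    if (!std::regex_match(uri, m, kUriRe)) {
        return false;
    }
    out.scheme = m[2].str();
    out.authority = m[4].str();
    out.path = m[5].str();      // ";params" stay inside the path segments
    out.query = m[7].str();
    out.fragment = m[9].str();
    return true;
}
```

Splitting http://example.com/a;x=1/b;y=2?q=1#frag this way yields the path /a;x=1/b;y=2, so parameters attached to any segment, not just the last one, remain part of the path.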
Comments from Boris (2005/Oct/24)
I was talking to dbaron about IRIs the other day... Some of the modern specifications (eg SVG) explicitly state that their URIs are text representations of IRIs. Which means that we're going to have two distinct types of URIs floating around:
- Those just specified to be URIs (eg HTML attributes). For these, we have to pick the encoding to use when sending to the server; as you noted in the TODO list, this is likely to mean UTF-8 for everything but the query, and the page encoding for the query.
- URIs specified to be IRIs (SVG, etc). For these, I think we want to use UTF-8 throughout, per the IRI spec.
The question then becomes how to tell apart the two internally. Especially given that nsStandardURL will so helpfully use the base URI's origin charset, so just passing null for the charset for things that should be IRIs is not an option. :(
This is something I think we should resolve in the 1.9 timeframe, esp. given that you want to work on IRIs anyway. I can file a bug on this issue if you want.
Have we ever looked into how our cache locking affects performance, if at all? Does HTTP spend significant amounts of time waiting on the cache locks? That is, is there significant multi-thread access to the cache? Or do we generally access it from a very few threads?
We only read the cache streams on background threads. Everything else happens on the main thread exclusively. --darin