Tcl Source Code

Ticket UUID: 1358369
Title: geturl http://192.168.0.2/test/test.php?userid=sip:x@abc.com
Type: Bug
Version: obsolete: 8.4.11
Submitter: xiaotaow
Created on: 2005-11-16 19:35:29
Subsystem: 29. http Package
Assigned To: dkf
Priority: 5 Medium
Severity:
Status: Closed
Last Modified: 2005-11-18 22:21:38
Resolution: Fixed
Closed By: dkf
Closed on: 2005-11-18 14:01:13
Description:
http::geturl http://192.168.0.2/test/test.php?userid=sip:x@abc.com

will report an error. 

The regular expression 
set exp {^(([^:]*)://)?([^@]+@)?([^/:]+)(:([0-9]+))?(/.*)?$}

will get confused by the '@' in the URL.

-Xiaotao
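
The confusion described above can be reproduced outside the http package with a minimal sketch. It assumes the obscured address in the report is x@abc.com, as the encoded form quoted later in the thread suggests. The variable names passed to regexp are illustrative and are not taken from http.tcl.

```tcl
# Pattern quoted in the report.
set exp {^(([^:]*)://)?([^@]+@)?([^/:]+)(:([0-9]+))?(/.*)?$}

# Assumed form of the reported URL (address reconstructed from the thread).
set url {http://192.168.0.2/test/test.php?userid=sip:x@abc.com}

# Capture groups in order: scheme prefix, scheme, user part, host,
# ":port", port digits, path. The names are hypothetical.
if {[regexp -nocase -- $exp $url -> prefix proto user host portpart port path]} {
    puts "proto = $proto"
    puts "user  = $user"
    puts "host  = $host"
    puts "path  = $path"
} else {
    puts "no match"
}
```

Because ([^@]+@)? accepts any character other than '@', including '/' and '?', it can consume everything up to the '@' in the query string, so the text after the '@' ends up in the host group instead of 192.168.0.2.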
User Comments: dkf added on 2005-11-18 22:21:38:

backported after prodding from dgp

dkf added on 2005-11-18 21:01:15:

File Added - 156758: httpparse.diff

dkf added on 2005-11-18 21:01:12:

Added improved validator (I think it is compliant with RFC
3986, which I believe to be the most up-to-date URI spec) to
HEAD using attached patch.

I don't plan to backport this patch.

dkf added on 2005-11-18 07:07:23:

Reopening to stop me from losing the issue...

hobbs added on 2005-11-18 01:30:02:

Donal - you can reopen if you want to provide a better RE,
but make sure to add more tests along with it.  Regardless,
users should be directed toward better conformance with
x-url-encoding.

dkf added on 2005-11-17 18:22:57:

On the other hand, handling URLs more robustly is a good
thing anyway. Tests indicate that the RE:
{(?x)
   ^
   (?: (\w+) : )                                  # <protocol>
   (?: //
      (?: ([^@/:#?]+) (?: : ([^@/#?]+) )? @ )?    # <user> <pass>
      ( [^/:#?]+ )                                # <host>
      (?: : (\d+) )?                              # <port>
   )?
   ( / [^#]* )?                                   # <path> (including query)
   (?: \# (.*) )?                                 # <fragment>
   $
}

offers both robust parsing (including of a number of rarely
used parts of the URL spec) and yet is also fairly easy to
understand. It also doesn't capture bits that aren't
interesting.
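
As a rough illustration of how the RE above splits a URL, here is a minimal sketch. splitUrl is a hypothetical helper, not part of the http package and not the code from the attached patch, and the sample address x@abc.com is assumed from the encoded form quoted elsewhere in this ticket.

```tcl
# splitUrl is a hypothetical helper written for this illustration only.
# It returns a flat key/value list suitable for "array set".
proc splitUrl {url} {
    set re {(?x)
        ^
        (?: (\w+) : )                                  # <protocol>
        (?: //
            (?: ([^@/:#?]+) (?: : ([^@/#?]+) )? @ )?   # <user> <pass>
            ( [^/:#?]+ )                               # <host>
            (?: : (\d+) )?                             # <port>
        )?
        ( / [^#]* )?                                   # <path> (including query)
        (?: \# (.*) )?                                 # <fragment>
        $
    }
    if {![regexp -- $re $url -> proto user pass host port path fragment]} {
        error "unsupported URL: $url"
    }
    return [list proto $proto user $user pass $pass host $host \
                 port $port path $path fragment $fragment]
}

# The URL from the report: the '@' now stays inside the path/query part.
array set parts [splitUrl {http://192.168.0.2/test/test.php?userid=sip:x@abc.com}]
puts "host = $parts(host)"
puts "path = $parts(path)"
```

The user and password groups exclude '/', '?' and '#', so an '@' that appears after the first '/' can no longer be mistaken for the end of the user-info part.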

hobbs added on 2005-11-17 02:41:11:
Logged In: YES 
user_id=72656

That is not a valid geturl request.  The URL is not strictly
valid - it should be encoded, which http provides the
functions for.  The query part should be formed by:

(Tcl) 50 % http::formatQuery userid sip:x@abc.com
userid=sip%3ax%40abc.com

that is proper URL formatting.
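
A minimal sketch of that approach, assuming the same host and script path as in the report: the query string is built with http::formatQuery and appended to the base URL before the request is made. The base and url variable names are illustrative, and the geturl call is left commented out because it needs a reachable server. The case of the hex digits in the escapes may differ between versions of the http package.

```tcl
package require http

# Base URL taken from the report; the query is encoded separately.
set base  {http://192.168.0.2/test/test.php}
set query [http::formatQuery userid sip:x@abc.com]

# Join the two parts; ':' and '@' in the value are now percent-encoded.
set url "${base}?${query}"
puts $url

# With a reachable server, the request would then be:
# set token [http::geturl $url]
# http::cleanup $token
```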

Attachments: