Thursday 30 December 2004

Asia Earthquake/Tsunami Aid...

"The Australian public can help these aid agencies reach affected people by giving much needed cash donations to purchase and deliver urgently needed supplies. In kind donations of clothing, toys, blankets and food are NOT being collected as it is expensive to transport these goods overseas and they can be purchased more cost effectively either locally or regionally."

ACFID - Asia Earthquake

How to Help

Thank you

Monday 22 November 2004

Australiana

This is a pretty cool site for travellers and Australians...
Know Your Australia
Lots of information, including things you might not expect - such as language, food, humour and also The Man From Snowy River and other poems by Banjo Paterson.

OK, a bit off-topic...

Tuesday 26 October 2004

Page Inheritance In ASP.NET

Page Inheritance In ASP.NET is a favourite topic of mine, and although I've implemented a similar model to this in the past, I like how Jon's written it up.

Sunday 24 October 2004

C# "using" Tricks

Interesting idea from whatever: Stupid "using" Tricks -- scroll down to WriteEndElement and the discussion on writing xmlWriter...
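
I haven't copied the linked code, but the general idea can be sketched like this: a small IDisposable whose Dispose() writes the closing tag, so that "using" blocks mirror the nesting of the XML. ElementScope and UsingSketch are made-up names for illustration, and this is my assumption of what the trick looks like rather than the article's own code.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper: opens an element on construction, closes it on Dispose().
public sealed class ElementScope : IDisposable
{
    private readonly XmlWriter writer;

    public ElementScope(XmlWriter writer, string name)
    {
        this.writer = writer;
        writer.WriteStartElement(name);
    }

    public void Dispose()
    {
        // Runs when the using block exits, even if an exception is thrown.
        writer.WriteEndElement();
    }
}

public static class UsingSketch
{
    public static string Build()
    {
        StringWriter output = new StringWriter();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;

        using (XmlWriter w = XmlWriter.Create(output, settings))
        {
            using (new ElementScope(w, "library"))
            using (new ElementScope(w, "track"))
            {
                w.WriteString("Evolution (Intro)");
            }
        }
        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```

The inner scope is disposed first, so the end tags come out in the right order without a single explicit WriteEndElement call in the calling code.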

Xml Sucks...

Well, not really - but some people obviously think XML has some serious deficiencies.

Came across that link - a long read but interesting points-of-view - while searching for info on the iTunes (rocks!) Music Library XML file. I want to move info between two installs of iTunes (song ratings, # plays, etc) and thought hacking the XML file would be fairly easy. However, the XML file itself is weird and not what you'd expect from a semantically-rich XML syntax.
<plist version="1.0">
<dict>
<key>Major Version</key><integer>1</integer>
<key>Minor Version</key><integer>1</integer>
<key>Application Version</key><string>4.6</string>
<key>Music Folder</key><string>file://localhost/F:/My%20iTunes/</string>
<key>Library Persistent ID</key><string>037D341EA9748F0D</string>
<key>Tracks</key>
<dict>
<key>136</key>
<dict>
<key>Track ID</key><integer>136</integer>
<key>Name</key><string>Evolution (Intro)</string>
<key>Artist</key><string>Bliss n Esso</string>
<key>Album</key><string>Flowers In The Pavement</string>
<key>Genre</key><string>Hip-Hop</string>
etc...

My first impression was that it was a pretty crappy XML implementation... so I jumped onto Google to see what others thought.

This article Playlist to XML had an interesting comment by Bob Ippolito:

>>"The plist format is nasty, whoever designed it really didn't know anything about XML."
>That's just totally wrong. XML Property Lists are a very unambiguous and simple serialization format. They're designed to be extremely simple and fast to parse (no attributes, etc.) because they're ubiquitous in OS X.
So maybe I was being too harsh on the format -- never jump to conclusions or make assumptions about technology! Try to anticipate the creator's goals rather than dumping on something because it doesn't fulfil yours. Here's Apple's doco fyi.

Anyway, the same blog has some useful code for transforming the iTunes format : Cleaning up iTunes plist XML. It's not quite what I want... but will get to that another time.
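
For what it's worth, the format is easy enough to walk once you accept that a dict is just alternating key/value child elements. Here's a minimal sketch of reading the Tracks dict from a fragment like the one above. It uses System.Xml.Linq, and PlistSketch/ReadDict are names I made up for illustration; it ignores arrays, dates and the other plist value types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class PlistSketch
{
    // A plist <dict> holds alternating children: <key>, value, <key>, value...
    // Pair them up into a dictionary keyed by the <key> text.
    public static Dictionary<string, XElement> ReadDict(XElement dict)
    {
        Dictionary<string, XElement> result = new Dictionary<string, XElement>();
        List<XElement> children = dict.Elements().ToList();
        for (int i = 0; i + 1 < children.Count; i += 2)
        {
            result[children[i].Value] = children[i + 1];
        }
        return result;
    }

    public static void Main()
    {
        // Cut-down sample in the same shape as the iTunes library file.
        string xml = @"<plist version=""1.0""><dict>
  <key>Tracks</key>
  <dict>
    <key>136</key>
    <dict>
      <key>Track ID</key><integer>136</integer>
      <key>Name</key><string>Evolution (Intro)</string>
      <key>Artist</key><string>Bliss n Esso</string>
    </dict>
  </dict>
</dict></plist>";

        XElement root = XDocument.Parse(xml).Root.Element("dict");
        Dictionary<string, XElement> tracks = ReadDict(ReadDict(root)["Tracks"]);
        foreach (KeyValuePair<string, XElement> pair in tracks)
        {
            Dictionary<string, XElement> track = ReadDict(pair.Value);
            Console.WriteLine("{0}: {1} - {2}", pair.Key, track["Artist"].Value, track["Name"].Value);
        }
    }
}
```

The value's element name (integer, string, and so on) carries the type, so a fuller version would switch on XElement.Name to convert each value.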

Tuesday 19 October 2004

On NHibernate...

Object-relational modelling is an interesting topic - but it just seems so different/foreign that it's hard to know where to start.

It's helpful to read positive comments like these articles: NHibernate and NHibernate Part Two, but there are so many other options, like NPersist and many others...

Paul Wilson has a lot to say on Examples of O/R Mapping vs Stored Procedures

Anyway, eventually we'll get around to implementing something... somewhere...

SOT: The REAL Reason Behind the ObjectSpaces Furor

Open Lucene.NET

Wow - I didn't know that Lucene.net "disappeared" from SourceForge (wonder if selling it will work?)... to be recently replaced by Open Lucene.NET - The Open Source Search Engine.

Thanks Scott.

Tuesday 12 October 2004

VB.NET and C# syntactic diffs

Although I will always use C# by choice, I'm currently using the (better than nothing) tool ANTS.Load which uses VisualBasic for Applications 'automation'...

So, what's a C# programmer to do? Use this handy
VB.NET and C# Comparison

BUT what is VB for the C# @"literal string" ????

Sunday 10 October 2004

"Unable to Start Debugging" with VisualStudio web project

I much prefer Asp.net without web projects but setting it up on a new PC resulted in the dreaded "Unable to Start Debugging" message when I hit F5.

This MSDN article PRB: "Unable to Start Debugging" is NOT the problem in many cases.

It could also be that the site/virtual directory you are attempting to debug is using a different version of the framework than your app (eg. 1.0, 1.1 or 2.0) and requires you to run ASPNET_REGIIS.exe

Or it might be IIS authentication settings...

BUT for me, today, the problem was that my site (using 'Local', not 'Web Project' settings) did not have a web.config file. Simply adding a web.config file fixed the problem (all the above settings already being correct). A 'Web Project' would automatically have a web.config file included, but creating and converting a 'Class Library' to a website did not... Oops!

Monday 6 September 2004

'Architecture' articles

OK, so I got flamed a bit for sending around this article - .NET Architecture Center: : Secrets of Great Architects. Yes, you have to (try and) ignore the gung-ho Microsoft-speak about "being a *great* architect" but the fundamentals are all there.

Possibly this is a better article Realizing a Service-Oriented Architecture with .NET - it's shorter, easier to read and the pictures make more sense. The scope is a little narrower, but still a useful read for anyone who thinks 'architecture' is only for buildings.

This info on Developing Identity-Aware ASP.NET Applications is useful and "also provides detailed prescriptive guidance for implementing intranet and extranet ASP.NET applications that are integrated with Active Directory."

Wednesday 1 September 2004

Localization Resources with ASP.NET 2.0

Two CodeProject articles on Creating multilingual websites are great sources of information, and actually mirror the way I've also approached the problem with my own projects.

But something I hadn't read much about (yet) is Using Resources for Localization with ASP.NET 2.0 (Fredrik Normén's) which was linked from CodeProject. The approach in ASP.NET 2.0 still seems a bit clumsy to me - still no "built in" way to have site-wide localization management, nor a simple cross-over between localization of static elements (such as labels, text, html elements) and database-resident information (product names, news items, etc). You could argue that there are a multitude of possibilities and that Microsoft can't address all possible permutations -- but why not at least offer a basic solution, like they do for Authentication, Personalization, MasterPages, etc???

Monday 16 August 2004

About the Provider Model...

I'm planning to use the Provider pattern to abstract implementation details in my search-engine project (Searcharoo), so I'm always interested to read about it. Another search project - Nata1 has implemented the pattern, although I haven't looked at what they've done.

Provider Model Misconceptions is a useful resource that discusses some of the concepts.

Oh, don't forget Part 2 of the MS pattern article.

Wednesday 11 August 2004

IIS Cache-control: private

Interesting question today: IIS sets a default
cache-control: private header for ASP and ASPX requests which Microsoft justifies by saying that "dynamically generated pages aren't normally expected to be cached"... but where is this value _set_ in IIS and can it be easily turned off without asp/x code or ADSI script???

This MSDN article on CacheControlCustom in the metabase explains how to set CacheControlCustom = "no-cache" but I cannot find it 'pre-set' anywhere in the metabase as a default value.

This sounds like it will work for .swf and other custom MIME types that you might want to set specific cache policies for, but it's very dodgy; appending a CRLF and 'fake' cache-control header to the content-type sent by IIS... surely this would 'break' if a cache-control: header was also sent another way (such as 'cachecontrolcustom' above) - would result in two possibly conflicting headers...

There's also this information Microsoft IIS 5.0: IIS Optimization and the Metabase but it's not that useful either...

Wednesday 4 August 2004

Using XML in Localization

I will never understand why a company that owns translate.com calls itself "Enlaso"... but they have some interesting content on their website, including
Using XML in Localization.

The Rainbow tools might also be useful, offering simple encoding conversions, line-break conversion, RTF processing and other utilities... although I haven't tried them out myself (yet)

While we're talking about XML and L10n, Xliff is my favourite standard right now.

Tuesday 3 August 2004

c# Enums

Enums are a very useful way to improve code readability, speed-up coding (think intellisense autocomplete) and enforce business rules... but there are also niggly little issues like having to figure out the Enum.Parse method and how to AND/OR Enums together using the [Flags] attribute.

Thankfully, here's a very useful post of links to all sorts of Enum-related info
Enums + Attributes = Swiss Army Knife

And a very useful sample lives here Associating string values to items in code with code.

Decorating Enums with string attributes sounds like it could be used to solve two ongoing problems:
  • localizing (translating) the meaning of an Enum for display to a user, maybe using some sort of 'translation key' attribute; and
  • linking the Enum type to some underlying database lookup table to which it is related...
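
A minimal sketch of those niggly bits in one place: Enum.Parse, combining and testing [Flags] values, and reading a string attribute off an enum member. The Permissions enum, DisplayKeyAttribute and the "perm.*" keys are all invented for illustration; the linked posts have their own (probably better) versions.

```csharp
using System;
using System.Reflection;

// Hypothetical attribute carrying a 'translation key' for an enum member.
[AttributeUsage(AttributeTargets.Field)]
public sealed class DisplayKeyAttribute : Attribute
{
    private readonly string key;
    public DisplayKeyAttribute(string key) { this.key = key; }
    public string Key { get { return key; } }
}

[Flags]
public enum Permissions
{
    None = 0,
    [DisplayKey("perm.read")] Read = 1,
    [DisplayKey("perm.write")] Write = 2,
    [DisplayKey("perm.delete")] Delete = 4
}

public static class EnumSketch
{
    // Returns the attribute's key, or null for combined/unknown values.
    public static string GetDisplayKey(Enum value)
    {
        FieldInfo field = value.GetType().GetField(value.ToString());
        if (field == null) return null; // e.g. "Read, Write" is not a single field
        DisplayKeyAttribute attr =
            Attribute.GetCustomAttribute(field, typeof(DisplayKeyAttribute)) as DisplayKeyAttribute;
        return attr == null ? null : attr.Key;
    }

    public static void Main()
    {
        // Enum.Parse returns object, so it needs a cast; 'true' ignores case.
        Permissions parsed = (Permissions)Enum.Parse(typeof(Permissions), "write", true);

        // OR values together, AND to test for one of them.
        Permissions combined = Permissions.Read | Permissions.Write;
        bool canWrite = (combined & Permissions.Write) == Permissions.Write;
        bool canDelete = (combined & Permissions.Delete) == Permissions.Delete;

        Console.WriteLine(parsed);
        Console.WriteLine(combined);
        Console.WriteLine("write={0} delete={1}", canWrite, canDelete);
        Console.WriteLine(GetDisplayKey(Permissions.Read));
    }
}
```

The key returned by GetDisplayKey could then be looked up in a resource file or a database table, which covers both of the problems listed above.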

Thursday 29 July 2004

String.Format("{0}", "formatting string"};

I can never remember all the options for string formatting, particularly the ability to 'pad' strings - in the past I've written padding methods in C# because I didn't know about the format specifier!

idunno.org: String.Format("{0}", "formatting string"}; has a good reference for the most common format specifiers...
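
As a reminder to myself, here's a small sketch of the padding options: the alignment component of a format item ({index,width}) versus the hand-rolled PadLeft approach. The class name FormatSketch is just for illustration, and I've used InvariantCulture so the number example doesn't depend on regional settings.

```csharp
using System;
using System.Globalization;

public static class FormatSketch
{
    public static void Main()
    {
        // {0,8} right-aligns in a field of 8 characters; {0,-8} left-aligns.
        string right = string.Format("[{0,8}]", "abc");
        string left = string.Format("[{0,-8}]", "abc");

        // The 'manual' equivalent of the first line.
        string manual = "[" + "abc".PadLeft(8) + "]";

        // Zero-padding an integer uses a format specifier rather than alignment.
        string zeros = string.Format("{0:D5}", 42);

        // Alignment and a format specifier can be combined: {index,width:format}.
        string money = string.Format(CultureInfo.InvariantCulture, "[{0,10:N2}]", 1234.5);

        Console.WriteLine(right);
        Console.WriteLine(left);
        Console.WriteLine(manual);
        Console.WriteLine(zeros);
        Console.WriteLine(money);
    }
}
```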

idunno.org also has a good reference for accessing IndexServer from .NET.

Sunday 18 July 2004

ASP.NET HtmlControls vs WebControls

One of my 'standard' questions when interviewing for ASP.NET Developer roles is What is the difference between HtmlControls and WebControls?
I have not yet received a really good answer from any candidate, and I've been using this question for over 18 months in a number of interviews... I've just finished another round of interviews and haven't found much understanding of this stuff!

These two articles go some way towards explaining the issues:
What I want to hear is:

  • With HtmlControls, you *know* what the output will look like at design time (basically what you've typed, with the runat="server" gone)

  • The HtmlControl.HtmlGenericControl is the *only* way to accomplish some things neatly like manipulating the TITLE, META and other tags

  • WebControls have a 'consistent' object model (ie. they all have a .Text property, whereas HtmlControls might require using .Value or .InnerHtml)

  • WebControls *may* be rendered differently depending on their attributes OR the BrowserCaps of the browser. Eg. an asp:TextBox might be rendered as an INPUT, or as a TEXTAREA if TextMode="MultiLine"; the Date Picker will be drastically different on up- and down-level browsers


Things I don't want to hear are: HtmlControls don't run on the server, HtmlControls don't have ViewState support, HtmlControls are somehow less useful, confusing HtmlControls with non-runat=server Html tags in the source.

If you've done any Mobile development then it might help if you know how the BrowserCaps works to render cHTML/WML/etc on different devices...

Bonus points for: Mentioning the INPUT TYPE=FILE HtmlControl and how it works when run as an HtmlControl, mentioning the asp:Literal control and how all 'static' text between Server Controls on a page becomes an instance of the Literal control during page parsing, mentioning the more complex WebControls such as Date Picker and the Validation controls.

You're hired if: you can explain what happens when WebControls are rendered on up- and down-level browsers; you can explain the barriers to the WebControls being rendered as up-level on non-IE browsers, the issues with the Validation controls in this scenario and how to fix them; and you know why these two classes are significant for control rendering: System.Web.HttpBrowserCapabilities and System.Web.Mobile.MobileCapabilities...

I wonder how many candidates 'Google' their prospective employers/interviewers...

UPDATE: words to avoid in your resume

Saturday 17 July 2004

"Universal" or "Neutral" Spanish Translations

There's always something new to learn about software localization - for instance
Microsoft's approach to creating "Universal" or "Neutral" Spanish Translations.

I suppose if you asked the 'person on the street' they might think there _is_ only one Spanish (or French, for example) language - at the other extreme if you asked a Mexican, Bolivian (or New Caledonian/French Canadian) they would probably insist their native tongue is quite distinct from that used in Spain (or France).

When faced with localization in these languages, the two obvious solutions might be:
* use the 'original' language and expect it to work in all other countries; or
* localize into each and every 'dialect' of the language to ensure you do not alienate customers in each country.

The (think outside the square) solution is, of course, to merge these two ideas -- purposefully structure the language you use so that it appears natural to ALL speakers. That's actually a lot harder than it sounds, but reducing the number of software versions you have to ship from ten to one (eg. in Spanish) certainly appears to make it worthwhile. This i18nguy article about Microsoft discusses the concept of “Universal” or “Neutral” Spanish and particularly Latin American Spanish (es-americas).

Thursday 15 July 2004

MS: Hazards of Hiring

The Hazards of Hiring is the last topic I expect to read about on MSDN -- but relevant since I'm hiring at the moment.

The other article I found useful was the Guerrilla Guide to Interviewing, which I may have blogged previously (but it's worth another look).

UPDATE [7-Sep] WHY is it that candidates no longer feel any compulsion to do any research on a company before attending an interview? My first question is always "What have you learned about our company?" or "Did you review our website?" -- the answers to which are inevitably "No..." Why waste your time (and mine) by turning up to an interview where you don't even understand what our company does? This laziness is so common that it's ceased to be a useful measure of a candidate's interest in a position, because NONE of them bother to do it!

Wednesday 7 July 2004

Character-set/encoding detection using C# - NCharDet

NCharDet is a C# port of the Java JCharDet, which is based on C++ code used in the Mozilla open-source browser project.

I don't yet fully understand the implications of the Netscape Public Licence so the code is just posted on my website for now.

I was disappointed that I couldn't find info on this already available on the web - particularly that no-one seems to have got MLang.dll running under COM Interop with C#/.NET...

Saturday 3 July 2004

Populating a Search Engine with a C# Spider on [The Code Project]

My two Searcharoo search engine articles are now also available on The Code Project

as well as on my personal website... a friend suggested they might get better coverage there - and so far a couple of very useful comments, so it was a good idea!


ALSO I just came across an excellent article about Integrating Unit Testing into the Software Development Lifecycle - check it out.

AND FINALLY, this is also a great post/book review of Debugging the Development Process by Steve Maguire. Even if you don't read the book - READ THE POST.

Wednesday 30 June 2004

VS.NET 2005 : Why is MS Hard Coding?

I've Seen the Future and It's Hard Coded talks about the new /Data, /Code and /Themes folders in ASP.NET 2.0

I hadn't really thought that much about it, but it is unfortunate that MS hasn't made the foldernames 'more unique' or configurable. If an ASP.NET 1.1 website uses folders with those names, you may have to update your data/links/etc... although I wonder if it wouldn't be too hard to write an HttpHandler to intercept requests for "your" /Code path and process it using HttpContext.Current.RewritePath and an alternate directory-name like /MyCode

Hmmm - planning to download Express soon, so I might give that a try...

Word stemming for Search

Version 2 of Searcharoo is now online (although still in final draft form) and I'm starting to think about version 3.

The first priority is building a database and/or file-based persistence layer, but other more advanced search-type technology is also on my mind, including STEMMING :

What is Stemming?

Paice/Husk stemmer modifications by Antonio Zamora

Limitations of Stemming

and for Japanese text (which I'm also interested in)...

A WWW JAPANESE DICTIONARY and article 2

and

The Challenges of Intelligent Japanese Searching

Saturday 26 June 2004

JoelOnSoftware gets flamed

SecretGeek writes How Microsoft Lost the Joel War and it's pretty good too.

TODO Driven Development beats eXtreme Programming and TestDriven Development hands-down :-)

Friday 25 June 2004

Open-source .NET search engine : Nata1

The Homepage of Nata1 is now 'live' and contains an article about implementing a .NET search engine using either custom data structures, Index Server or Google.

The search input and results are highly customizable using templated controls that have some VS designer support, and you can use Google on your site without having to deal with their API.

Interesting 'search' project to watch...

Thursday 24 June 2004

Writing Language-Portable Transact-SQL...

Writing Language-Portable Transact-SQL... and no it doesn't mean C#, VB or JScript - this article actually discusses some of the more esoteric issues relating to SQL Server and multilingual data (you know, languages other than English) including interpreting dates, Unicode fields and 'collation'.

Useful if you are unsure about how to approach these issues in SQL - although it does make it sound more complex than it really is (for simple applications)... Some date handling should just be common sense by now!

Wednesday 23 June 2004

CassiniEx Web Server

I think Whidbey (VS.NET05) has a version of Cassini built-in as the 'development server' -- but this looks pretty cool and is available "now"...

CassiniEx Web Server is an enhanced open-source version of Microsoft's cut-down C# web server. I currently start 2 or 3 Cassini sessions (port 8081, 8082, 8083...) on my XP Home machine when I'm developing/debugging with #develop and WebMatrix. Hoping that CassiniEx will make my XP Home development environment a whole lot easier to manage... it's downloading now...

OT: Parsing CSV files using the ODBC Text driver could come in really handy - although I'm not quite sure for what, yet.

Sunday 20 June 2004

Unicode and multilingual support in HTML...

Don't know how I've not come across Unicode and multilingual support in HTML, fonts, Web browsers and other applications before.

Definitely a great addition to my Localization and Globalization info.

OT: Found my first commercial use for ExtendedHtmlUtility.HtmlEncode() today: a client's website is hosted on an ISP's Apache Server - configured to ALWAYS set the HTTP Header Content-Type: Shift_JIS. This was making it impossible to serve Korean and Chinese pages from this server, since W3C says
To sum up, conforming user agents must observe the following priorities when determining a document's character encoding (from highest priority to lowest):
  1. An HTTP "charset" parameter in a "Content-Type" field.
  2. A META declaration with "http-equiv" set to "Content-Type" and a value set for "charset".
  3. The charset attribute set on an element that designates an external resource.

Which means the browser (IE, Firefox, Netscape, etc.) will ALWAYS think the page is Shift_JIS (Japanese) and not display Korean or Chinese text correctly!

By converting ALL the non-ASCII (well, all non-Shift-JIS actually) characters into Html Entities (eg. &#1234;) the page will be successfully displayed in Korean or Chinese with the encoding set to Shift_JIS. This works because [ & # 0-9 ; ] are all valid Shift_JIS characters, and once they're resolved into their Unicode characters, the browser is happy to display them using whatever font-settings (or mappings) it knows about, regardless of the actual page encoding!

It's not ideal, but at least it works - even in Netscape 4.7 (as long as you have specified the correct fonts, because we all know how dumb NS4 is at font substitution). I suspect if the pages had any 'text' within Javascript strings/variables/etc that would have caused a problem... Luckily not (this time).

Wednesday 9 June 2004

UrlEncode vs. HtmlEncode

In an earlier blog about Html Entities and Unicode I touched on the HtmlEncode() method.

This page UrlEncode vs. HtmlEncode contains the complete framework HtmlEncode() method using Reflector. It clarifies exactly what to expect from the method, and confirms that HtmlEncode() only converts a subset of Unicode characters into Html Entities (eg. &#1234;).

OT: I really like the look of this site, too.

Visual Source Safe vs Visual Studio Team System

It seems like most Microsoft products have gone from version 3 to version 11 in the same timeframe as VSS has stumbled from 6.0a to 6.0f...

So it's no surprise that there has been a lot of excitement about the news of a NEW version of VSS; and also of some cool 'Team' features in VS.NET2005... Imagine the confusion, then, as it becomes clear these are not the same product!

Visual Studio Team System is a good place to start reading about VSTS -- but not necessarily VSS. The MSDN Team System page has a good overview.

I like this quote "So VSS is to Hatteras[Team System] as Access is to SQL Server" (from somewhere here I think) except VSS doesn't have a cut-down, database driven back-end that easily migrates/scales to VSTS, like Access → SQL (??).

The Microsoft Visual SourceSafe Roadmap explains a bit more about SourceSafe 2005 -- basically sounds like the version upgrade we wished for around 2000... hard to see it lasting to the post-Whidbey VisualStudio release (i.e. beyond 2006).

Sunday 6 June 2004

User input - who puts commas in a number, anyway?

My users keep finding little issues with our 'Site Admin' pages - they want to do crazy things like type large numbers with 'thousands' separators and/or decimal points, eg. 12,500 and 3,500.99

Of course, in my testing I only type 1233465 or 99999 so I never see the Exceptions thrown by their "bad" input.

This page is a handy reference for all things format-related, including both ToString() output as well as parsing inputs : Inside C#, Second Edition: String Handling and Regular Expressions Part 1.

The answer to my problem is here too - Int32.Parse(string, NumberStyles) with the System.Globalization.NumberStyles enum to pick and choose what inputs are acceptable. You might still want a try/catch, or some Regex Validation on the client side, just to make sure...
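
Here's a minimal sketch of what that looks like. I've used the TryParse overloads instead of Parse-plus-try/catch (same NumberStyles flags, no exception to handle), and InvariantCulture is an assumption so that ',' and '.' mean the same thing on every server; a real admin page might want the user's culture instead. NumberInputSketch and its method names are made up for illustration.

```csharp
using System;
using System.Globalization;

public static class NumberInputSketch
{
    // Whole numbers: allow "12,500" and surrounding spaces, but no decimal point.
    public static bool TryParseWhole(string input, out int value)
    {
        NumberStyles styles = NumberStyles.AllowThousands
                            | NumberStyles.AllowLeadingWhite
                            | NumberStyles.AllowTrailingWhite;
        return Int32.TryParse(input, styles, CultureInfo.InvariantCulture, out value);
    }

    // Amounts: allow thousands separators and a decimal point, eg. "3,500.99".
    public static bool TryParseAmount(string input, out decimal value)
    {
        NumberStyles styles = NumberStyles.AllowThousands | NumberStyles.AllowDecimalPoint;
        return Decimal.TryParse(input, styles, CultureInfo.InvariantCulture, out value);
    }

    public static void Main()
    {
        int whole;
        decimal amount;

        Console.WriteLine(TryParseWhole("12,500", out whole) + " " + whole);
        Console.WriteLine(TryParseWhole("3,500.99", out whole)); // not a whole number
        Console.WriteLine(TryParseAmount("3,500.99", out amount) + " "
            + amount.ToString(CultureInfo.InvariantCulture));
    }
}
```

Note that an Int32 can't hold 3,500.99 no matter which flags are set, so inputs with decimal places need a decimal (or double) target type.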

Thursday 3 June 2004

.NET tools

Came across two useful-sounding tools today

QuickCode.NET for speeding up creation of code constructs within VS.NET (eg. type "prop int test" and it will expand to a complete Property get/set definition)... and you can create your own shortcuts+templates.

log4net looks waaay better than the dodgy Trace.WriteLine/TraceListener setup I'm currently using... being able to set hierarchical log targets across a number of different 'formats' (EventLog, file, email, etc).

In the process of installing them both now...

Wednesday 2 June 2004

CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart)

You're not alone if you haven't heard that CAPTCHA acronym before - neither had I.

However I have come across them many times - Network Solutions requires you to pass a CAPTCHA test before every "whois" query, and many other sites (particularly 'free' services) are now asking new users to 'verify' themselves via CAPTCHA...

So what is it? A combination image & form input which requires you to 'decipher' some distorted/disguised/obfuscated text from the image and type it into the browser. It's a LOT easier for people to read these images than it is for automated bots to OCR or otherwise guess them - thus preventing automated/fraudulent use of your website/resources!

Interesting, eh? This article 15 Seconds : Fighting Spambots with .NET and AI is a great intro, and includes some interesting examples and Visual Basic.NET sample code. See the CAPTCHA references too.

Tuesday 1 June 2004

Searcharoo article 'online'

Way cool - my how to build a c# search engine is article of the day on ASP.NET Web: The Official Microsoft ASP.NET Site : Home Page.

Neat!

Automating Excel with C# II - unexpected dialogs

Opening files from C# occasionally results in the app (Excel, for example) opening an unexpected dialog box (such as "Name Conflict - Name cannot be the same as a built-in name. Old name: Print_Area New Name: ________".)
AFAICT there's no parameter in the Application.Workbooks.Open() method that I can use to auto-dismiss this dialog, so I need to address it some other way.

A simple answer appears to be the
SendKeys Class
which I 'discovered' via this MSDN article: HOWTO: Dismiss a Dialog Box Displayed by an Office Application with Visual Basic.

I haven't actually implemented it yet (and even if I do send an 'ESCAPE' sequence to cancel the dialog, I still won't be able to open the file). BUT at least it won't hang/wait forever for user input that is never coming. Plus, (hopefully) sending 'ESCAPE' will also deal with some other (as yet unknown) dialogs that might appear...

Sunday 16 May 2004

A Blog with Brains? Joel on Software

I infrequently find myself on Joel Spolsky's site as the result of a search for something or other. It's not really a blog but a collection of articles on many different (it/web/developer) topics. The stuff he writes is ALWAYS interesting.

With that in mind, I finally added his Article Archive to the list of sites I visit regularly, just to see what's there.

I don't know how I missed this article previously, given its relevance to me and my work, but The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) is my 'read of the week'.

Saturday 15 May 2004

More on... Client-Side Validation in Downlevel Browsers

This article ASP.NET.4GuysFromRolla.com: Client-Side Validation in Downlevel Browsers summarises the issues with validation in 'downlevel' browsers [although calling them 'downlevel' is Microsoft techmarketing at its best]. Why not call them 'more standards compliant than IE' browsers instead?

More and more clients will be using non-IE browsers (Safari, Firefox, Mozilla) and non-PC-devices in future, so it makes sense to try and provide the same user-experience (ie. faster, client-side validation).

Go to the article to find links to standards-DOM-based validation controls; or the Longhorn beta documentation, which contains simplified versions of the validator controls that ship with the .NET Framework. Unlike the validator controls in the SDK, which work only with Internet Explorer, these controls comply with the World Wide Web Consortium Document Object Model Level 1 specification (W3C DOM Level 1) and support a number of browsers such as Internet Explorer 5, Netscape Navigator 6, and Opera 5.

Thursday 13 May 2004

Automating Word with C#

The Microsoft Word MVP FAQ is a useful reference, however it's annoying that so much Office Automation code on the web is still in VB/VB.NET. It appears very few people are using C# for automation projects (so far).

Also, there is no more frustrating phrase in programming than finding this: Microsoft has confirmed that this is a problem in the Microsoft products that are listed at the beginning of this article on MSDN in relation to a problem you are having, like automating OrgCharts in Excel. Argh!

Wednesday 12 May 2004

HttpHandlers and HttpModules

Both these articles are very useful if you want to understand the handler/module concepts, and get some sample code.

You'll want to know about this if you are doing URL-rewriting, sophisticated accept-language-detection or other early-in-the-pipeline processing.

Monday 3 May 2004

Unicode to Html Entity and back again

WorldPay with C#/.NET [2] also discusses encoding "double byte" (unicode, really) values as Html Entities. The 'final' code is
private string HtmlEntityEncode (string unicodeText) {
    int unicodeVal;
    string encoded = "";
    foreach (char c in unicodeText) {
        unicodeVal = c;
        if ((c >= 48) && (c <= 122)) {
            // in 'ascii' range x30 (48) to x7a (122) which is 0-9A-Za-z plus some punctuation
            encoded += c; // leave as-is
        } else { // outside that range - encode as a numeric entity
            encoded += string.Concat("&#",
                unicodeVal.ToString(System.Globalization.NumberFormatInfo.InvariantInfo), ";");
        }
    }
    return encoded;
}


But it's also fairly easy to get your 'original' string back... this code can go anywhere
System.Text.RegularExpressions.Regex entityResolver =
    new System.Text.RegularExpressions.Regex (@"([&][#](?'unicode'\d+);)|([&](?'html'\w+);)");
string outputString = entityResolver.Replace(inputString,
    new System.Text.RegularExpressions.MatchEvaluator (ResolveEntity) );

as long as this method is available
private string ResolveEntity (System.Text.RegularExpressions.Match matchToProcess) {
    string x = "X"; // default 'char placeholder' if cannot be resolved
    if (matchToProcess.Groups["unicode"].Success) {
        x = Convert.ToChar(Convert.ToInt32(matchToProcess.Groups["unicode"].Value) ).ToString();
    } else {
        if (matchToProcess.Groups["html"].Success) {
            switch (matchToProcess.Groups["html"].Value.ToLower()) {
                // this could be expanded to as many as you like, or (maybe)
                // System.Web.HttpUtility.HtmlDecode will work on
                // the whole 'entity' string... ?
                case "nbsp": x = " "; break;
                case "copy": x = "(c)"; break;
                case "lt": x = "<"; break;
                case "gt": x = ">"; break;
                case "amp": x = "&"; break;
            }
        }
    }
    return x;
}


UPDATE: list of HTML 4 entities which could be used to write a robust ResolveEntity() method using the 'pattern' above.
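
On the "maybe HtmlDecode will work" comment in the code: the framework's decoder does resolve both numeric and named entities, so the hand-written switch may not be needed. Here's a minimal round-trip sketch using System.Net.WebUtility.HtmlDecode (a later addition to the framework than the code above; System.Web.HttpUtility.HtmlDecode is the older equivalent). EntitySketch and EncodeNonAscii are names made up for illustration, and the encoder here is a simplified variant that leaves all of ASCII alone, not the exact range used above.

```csharp
using System;
using System.Globalization;
using System.Net;
using System.Text;

public static class EntitySketch
{
    // Simplified encoder: ASCII passes through, everything else becomes &#NNNN;
    // Works per UTF-16 char, so characters outside the BMP are not handled.
    public static string EncodeNonAscii(string text)
    {
        StringBuilder sb = new StringBuilder();
        foreach (char c in text)
        {
            if (c < 128)
            {
                sb.Append(c);
            }
            else
            {
                sb.Append("&#");
                sb.Append(((int)c).ToString(CultureInfo.InvariantCulture));
                sb.Append(';');
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string original = "caf\u00e9 \ud55c\uae00"; // Latin-1 accent plus two Hangul syllables
        string encoded = EncodeNonAscii(original);
        string decoded = WebUtility.HtmlDecode(encoded);

        Console.WriteLine(encoded);
        Console.WriteLine(decoded == original);
        Console.WriteLine(WebUtility.HtmlDecode("&lt;b&gt; &amp; &copy;"));
    }
}
```

Unlike the switch statement, HtmlDecode turns &copy; into the real © character rather than "(c)", which may or may not be what you want.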

'Securing' data on client round-trips

When my website users go 'off-site' I want to set some information that I can check later, with some confidence that they have not altered it.

OK, I can put it in a Session variable, but I don't have Sessions enabled. What I want to do is encrypt a little chunk of data which can be added to the ViewState, hidden FORM field or Cookie (ie. it will be "client/transport implementation independent"), and check it again later on...

I think these two articles: String Encryption With Visual Basic .NET and Building Secure ASP.NET Applications: Authentication, Authorization, and Secure Communication might be what I'm looking for... perhaps using Base64 if there's binary data being generated.

If I get it working I'll post it here (as always)
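
In the meantime, here's a rough sketch of one way to do the "check it has not been altered" part. Note this swaps in a different technique from the encryption articles above: it signs the data with an HMAC (System.Security.Cryptography.HMACSHA256) rather than encrypting it, so the user can still read the value but can't change it without the server-side key. TamperCheckSketch, Protect and TryUnprotect are made-up names, and key storage is left out entirely.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class TamperCheckSketch
{
    // Produces "base64(data).base64(hmac)"; safe to split on '.' because
    // Base64 never contains that character.
    public static string Protect(string data, byte[] key)
    {
        byte[] payload = Encoding.UTF8.GetBytes(data);
        using (HMACSHA256 hmac = new HMACSHA256(key))
        {
            byte[] mac = hmac.ComputeHash(payload);
            return Convert.ToBase64String(payload) + "." + Convert.ToBase64String(mac);
        }
    }

    // Returns false if the token is malformed or the signature does not match.
    public static bool TryUnprotect(string token, byte[] key, out string data)
    {
        data = null;
        string[] parts = token.Split('.');
        if (parts.Length != 2) return false;

        byte[] payload, mac;
        try
        {
            payload = Convert.FromBase64String(parts[0]);
            mac = Convert.FromBase64String(parts[1]);
        }
        catch (FormatException)
        {
            return false;
        }

        using (HMACSHA256 hmac = new HMACSHA256(key))
        {
            if (!SameBytes(hmac.ComputeHash(payload), mac)) return false;
        }
        data = Encoding.UTF8.GetString(payload);
        return true;
    }

    // Compares every byte so timing does not reveal where a mismatch occurs.
    private static bool SameBytes(byte[] a, byte[] b)
    {
        if (a.Length != b.Length) return false;
        int diff = 0;
        for (int i = 0; i < a.Length; i++) diff |= a[i] ^ b[i];
        return diff == 0;
    }

    public static void Main()
    {
        byte[] key = Encoding.UTF8.GetBytes("replace-with-a-real-secret-key");
        string token = Protect("order=42;amount=100", key);
        string data;
        Console.WriteLine(TryUnprotect(token, key, out data) + " " + data);
    }
}
```

The token is plain text, so it fits in ViewState, a hidden FORM field or a Cookie; Base64 can contain '+', '/' and '=' so it would need URL-encoding in a query string. If the value itself must be secret, it would need encrypting as well.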

Tuesday 27 April 2004

DHTML JavaScript Tooltips

I use ToolTips a lot in my web apps, not just for 'help' information but also hovering over rows in a table to present additional information rather than having super-wide tables/lots of columns.

This DHTML JavaScript for Cross Browser Tooltips is the coolest implementation I've come across...

Thursday 22 April 2004

ASP.NET Browser Detection - uplevel FireFox?

I'm so impressed with the latest (0.8 last I checked) release of FireFox that I've started using it to test my ASP.NET apps.

Therefore I was pleasantly surprised to find someone (Rob) has already figured out Browser Testing and Detection Resources to encourage ASP.NET to play nicely with the latest IE-equivalent browsers...

Thanks to Mitch's blog on Web Controls and Browsercaps for pointing me there.

Wednesday 14 April 2004

C# and Excel Automation : Google Groups to the rescue

I don't post to newsgroups that often (although I try to contribute as much as I can to asp.net), but whenever I do I'm always amazed (and thankful) at the speed and quality of responses.

Overnight my problem with Excel automation was solved:
Google Groups: Looping in Excel XP with C# -- Ranges and SpecialCells - reducing processing time from 90 minutes to 3 minutes!

Here is the 'working' code; loops through the Areas returned by the SpecialCells method so we only process non-empty cells...

Excel.Range range = sheet.UsedRange;

// 3 = xlNumbers (1) + xlTextValues (2): only constant cells holding numbers or text
Excel.Range newrange = range.SpecialCells(Microsoft.Office.Interop.Excel.XlCellType.xlCellTypeConstants, (object)3);
for (int areaid = 1; areaid <= newrange.Areas.Count; areaid++) {
    Excel.Range arearange = newrange.Areas.get_Item(areaid);
    for (int row = 1; row <= arearange.Rows.Count; row++) {
        for (int col = 1; col <= arearange.Columns.Count; col++) {
            Excel.Range cell = (Excel.Range)arearange.Cells[row, col];
            // do stuff with cell.Value2
        }
    }
}
FYI, the SpecialCells (object)3 parameter is the sum of the relevant XlSpecialCellsValue 'constants' below (here xlNumbers + xlTextValues):

xlErrors 16
xlLogical 4
xlNumbers 1
xlTextValues 2
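To make the arithmetic less magic, the values can be combined by name instead of hard-coding 3. A small sketch using a stand-in enum that copies the numbers listed above; real code would use Microsoft.Office.Interop.Excel.XlSpecialCellsValue, and the SpecialCellsValue and SpecialCellsDemo names here are invented for illustration.

```csharp
using System;

// Stand-in enum mirroring the constants listed above (not the interop type).
[Flags]
public enum SpecialCellsValue
{
    Numbers = 1,
    TextValues = 2,
    Logical = 4,
    Errors = 16
}

public static class SpecialCellsDemo
{
    // Combines the two flags used in the post: numbers and text.
    public static int NumbersAndText()
    {
        return (int)(SpecialCellsValue.Numbers | SpecialCellsValue.TextValues);
    }
}
```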

Tuesday 6 April 2004

Automating Excel with C#

After much hair-pulling because I could not make Excel "work" as it should, I came across this article 328347 - PRB: "Member Not Found" Error Message When You Use a For Each Statement on an Excel Collection with Visual Basic .NET or Visual C# .NET

So that's why I couldn't loop through cells!

This seems to work better:
// 'book' (an open Excel.Workbook) and 'cellContent' (a string) are assumed to be declared earlier
if (m_Excel.Workbooks.Count >= 1) {
    Excel.Sheets sheets = book.Worksheets;
    Excel.Range range, cell;
    // foreach (Excel.Range cell in range.Cells ) BROKE http://support.microsoft.com/?kbid=328347
    foreach (Excel.Worksheet sheet in sheets) {
        range = sheet.UsedRange;
        // index-based loops instead of foreach over the cells (Excel indexes are 1-based)
        for (int row = 1; row <= range.Rows.Count; row++) {
            for (int col = 1; col <= range.Columns.Count; col++) {
                cell = (Excel.Range)range.Cells[row, col];
                if (null != cell) {
                    if (null != cell.Value2) {
                        cellContent = cell.Value2.ToString();
                        // do something with the text...
                    }
                }
            }
        }
    }
}


Incidentally, I still don't know why ".Value2" ...

Monday 5 April 2004

File Extension filtering with .NET Regular Expressions

The 3 Leaf: .NET Regular Expression Repository provided a great starting point for writing a file extension filter to prevent certain file-types being uploaded in an ASP.NET HtmlInputFile control, using the RegularExpressionValidator.

Ultimately I came up with this
^([a-zA-Z]\:|\\)\\([^\\]+\\)*[^\/:*?<>|]+(\.txt|\.doc|\.xls|\.ppt|\.pdf|\.htm|\.html|\.zip)$
which seems to work OK, HOWEVER it's case-sensitive (so myFile.DoC won't upload).

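If you only want to do validation on the server-side, a simple modification will enable 'case-insensitivity' (the (?i) inline option is .NET syntax that the browser's JavaScript regex engine does not understand, so it breaks client-side validation)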
(?i)^([a-zA-Z]\:|\\)\\([^\\]+\\)*[^\/:*?<>|]+(\.txt|\.doc|\.xls|\.ppt|\.pdf|\.htm|\.html|\.zip)$
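The same server-side check can be written directly against the Regex class, passing RegexOptions.IgnoreCase instead of the inline (?i). A minimal sketch: UploadFilter and IsAllowed are made-up names, and the character class is written as <> on the assumption that any stray semicolons are markup debris.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UploadFilter
{
    // Same pattern as above; IgnoreCase replaces the inline (?i) option.
    private static readonly Regex AllowedFile = new Regex(
        @"^([a-zA-Z]\:|\\)\\([^\\]+\\)*[^\/:*?<>|]+(\.txt|\.doc|\.xls|\.ppt|\.pdf|\.htm|\.html|\.zip)$",
        RegexOptions.IgnoreCase);

    // clientPath is the full path the browser posts, e.g. C:\docs\myFile.DoC
    public static bool IsAllowed(string clientPath)
    {
        return clientPath != null && AllowedFile.IsMatch(clientPath);
    }
}
```

The pattern expects a full Windows path (drive letter or UNC), which is what IE sent at the time; a browser that posts only the bare file name would fail the check.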

To really make it nice I wanted client-side validation as well, which meant building a custom validation control inheriting from BaseValidator. Most of the documentation I found was USELESS, but amazingly the 'beta' Longhorn documentation on Client-Side Functionality in a Server Control, Validator Control Samples and Client-Side Functionality in a Server Control provided all the information I needed on the otherwise cryptic uses of RegisterValidatorCommonScript, RegisterValidatorDeclaration and the other bits and pieces required to implement a client- and server-side validator.

The result, InsensitiveRegularExpressionValidator is still in testing but will be posted soon.

Monday 29 March 2004

Microsoft Word Automate WordCount frustration

It's amazing how difficult it can be to find information about MS Word when writing VBA (or automation using C#) - I needed to find the wordcount and charactercount programmatically, but the "document properties" m_Word.ActiveDocument.Words.Count and m_Word.ActiveDocument.Characters.Count don't seem to relate to ANYTHING!?!?!.

Thankfully this page myITforum.com : Document Word Counter gave me a push in the right direction:

object missing = System.Reflection.Missing.Value;
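// ComputeStatistics returns the count; pass wdStatisticCharacters for the character count
int wordCount = m_Word.ActiveDocument.ComputeStatistics(Microsoft.Office.Interop.Word.WdStatistic.wdStatisticWords, ref missing);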

Why have Words and Characters properties if they don't DO anything???

P.S. Originally thought DSOFile would be able to do this job - but the Word and Char Count stored in the document properties in the filesystem isn't always up-to-date or meaningful either - it seems like ComputeStatistics is the only reliable way...

Wednesday 24 March 2004

Parsing html markup text using MSHTML

Came across this article today; wondering whether this is a good solution for improving Searcharoo.net - both its ability to "spider" the web by finding links in Html, and also parsing the Html into words for indexing (eg. pulling out the META tags, etc)... Parsing html markup text using MSHTML By Hendrik Swanepoel

I really want something lightweight that will help parse Html (a) links for spidering and (b) words for indexing... Other than some complex Regex, MSHTML is the only other option I've come across...
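For comparison, the Regex route for part (a) might look something like this. It is a rough sketch, not a real HTML parser: it only picks up href values on anchor tags and will miss or mangle unusual markup. LinkExtractor and FindLinks are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Captures the href value of <a ...> tags, quoted or unquoted.
    private static readonly Regex Href = new Regex(
        @"<a\s[^>]*?href\s*=\s*[""']?([^""'\s>]+)",
        RegexOptions.IgnoreCase);

    public static List<string> FindLinks(string html)
    {
        var links = new List<string>();
        foreach (Match m in Href.Matches(html))
        {
            links.Add(m.Groups[1].Value);
        }
        return links;
    }
}
```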

Sunday 21 March 2004

WorldPay with C#/.NET [2]

Maybe useful info for those using C# with WorldPay...

WorldPay.COMcallbackClass callback = new WorldPay.COMcallbackClass(); 
callback.processCallback(); 
if (callback.hadError() ) { 
Trace.Write("WorldPay Error", callback.getRawAuthMessage() );
} 
if (callback.didTransSuc() ) { 

// #### WORLDPAY demo page #### 
string traceInfo = ""; 
traceInfo +="culture:" + Thread.CurrentThread.CurrentCulture.EnglishName + "\n"; 
traceInfo +="uiculture:" + Thread.CurrentThread.CurrentUICulture.EnglishName + "\n"; 

traceInfo +="getTransId:" + callback.getTransId() + "\n"; 
traceInfo +="getRawAuthMessage:" + callback.getRawAuthMessage() + "\n"; 
traceInfo +="getRawAuthCode:" + callback.getRawAuthCode() + "\n"; 
traceInfo +="getTransTime:" + callback.getTransTime() + "\n"; 
traceInfo +="shopperId:" + callback.shopperId + "\n"; 

traceInfo +="getInstallationId:" + callback.getInstallationId() + "\n"; 
traceInfo +="getCompanyName:" + callback.getCompanyName() + "\n"; 
traceInfo +="getAuthMode:" + callback.getAuthMode() + "\n"; 
traceInfo +="getAmount:" + callback.getAmount() + "\n"; 
traceInfo +="getCurrencyISOCode:" + callback.getCurrencyISOCode() + "\n"; 
traceInfo +="getAmountString:" + callback.getAmountString() + "\n"; 

traceInfo +="getDescription:" + callback.getDescription() + "\n"; 
traceInfo +="Customer getName:" + callback.getName() + "\n"; 
traceInfo +="Customer getAddress:" + callback.getAddress() + "\n"; 
traceInfo +="Customer getPostalCode:" + callback.getPostalCode() + "\n"; 
traceInfo +="Customer getCountryISOCode:" + callback.getCountryISOCode() + "\n"; 
traceInfo +="Customer getTelephone:" + callback.getTelephone() + "\n"; 
traceInfo +="Customer getFax:" + callback.getFax() + "\n"; 
traceInfo +="Customer getEmail:" + callback.getEmail() + "\n"; 

traceInfo +="\nTRANS_ID:" + callback.getParameterString("MC_DataID") + "\n";  // CUSTOM PARAM prefix with MC_ 
traceInfo +="MEMBER_ID:" + callback.getParameterString("MC_MemberID") + "\n"; // CUSTOM PARAM 

Trace.Write("WorldPay", traceInfo);
}

And if you're programming in Japanese... trying to get Japanese characters to display on the WorldPay pages was a real pain: encoding issues (I'm using UTF-8 for my site, they use Shift_JIS). To set the job description, name and address info I'm currently using this hack:

// so manually converting to HTML Unicode Entities 
// as discussed here http://www.randyrants.com/archives/000348.asp 
int ch1; 
string snail2="";  
foreach (char c in currentTransaction.Desc) { 
ch1 = c; 
if (ch1 < 128) {// if in 'ascii' range - *THINK* this works....
snail2 += c; // leave as-is 
} else { // convert to entity format &#1234;
snail2 += string.Concat("&#", ch1.ToString(System.Globalization.NumberFormatInfo.InvariantInfo), ";"); 
} 
} 
purchase.setDescription (snail2); // HACK: HTML Entity Encoding 
[4-May-04] UPDATED CODE here
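A slightly tidier variant of the same idea, as a sketch only: it treats everything outside 7-bit ASCII as needing a numeric entity, uses a StringBuilder instead of string concatenation, and handles surrogate pairs. NumericEntityEncoder and Encode are invented names, and whether WorldPay's pages accept every entity is untested here.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class NumericEntityEncoder
{
    // Leaves ASCII alone and writes everything else as &#NNNN;
    // Throws ArgumentException if the text contains an unpaired surrogate.
    public static string Encode(string text)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < text.Length; i++)
        {
            int codePoint = char.ConvertToUtf32(text, i);
            if (char.IsHighSurrogate(text[i])) i++; // skip the low half of the pair
            if (codePoint < 128)
            {
                sb.Append((char)codePoint);
            }
            else
            {
                sb.Append("&#")
                  .Append(codePoint.ToString(CultureInfo.InvariantCulture))
                  .Append(';');
            }
        }
        return sb.ToString();
    }
}
```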

i18nguy.com

My job often involves multilingual programming (or at the very least content translation). This site i18nguy.com has an excellent collection of links on Internationalization (I18n), Localization (L10n), Standards etc. I've found it useful on a number of occasions, but sometimes I just go there to browse and learn.

Today's fav page is guidelines and resources.

Thursday 18 March 2004

WayOffTopic: simple file exchange 'extranet'

Part of our everyday business is moving large files back and forth between offices (and countries). Usually FTP does the trick, but more and more clients are having firewall troubles with FTP (and to be honest, I'd rather not have the FTP service running at all).

What I wanted was a simple extranet application with an HTML File Upload and the ability to easily create client 'users' for one-off upload/downloads.

Google found Forms Authentication Using An XML Users File on MSDN, which has almost enough code to get me started.

First time in ages I've found an MS 'sample' useful (except the source projects on asp.net and winforms.net)!

Monday 15 March 2004

SQL Server 2005 "Yukon" Full-Text Search

Interesting article about the 'internals' of the full-text-search in SQL Server, including discussion of ranking.

Data Access & Storage Home: SQL Server "Yukon" Full-Text Search: Internals and Enhancements (Microsoft SQL Server 9.0 Technical Articles)

First time I've seen "Yukon" officially named "SQLServer 2005"... maybe some useful ideas in there for Searcharoo

Sunday 14 March 2004

Simple Site Search in .NET (1)

This article appeared on the MSDN Home Page recently: Part 4: Building a Better Binary Search Tree and it raised my curiosity about using the code provided to build a simple 'search engine' in C#.

I did a quick search to see what else was around (cheap/free products, or sample code) and found a few inspirational articles on WebMonkey, Perlfect and Apache.

A couple more articles on opening a file, removing white space and parsing HTML were all that was needed to get a basic search tool 'up and running' -- look for the code soon at Searcharoo.Net!

After being inspired by it, I didn't implement the Binary Search Tree straight away, that's another story...

WorldPay with C#/.NET

I'm in the process of converting an ecom site from ASP 'Classic' to .NET. The site uses the WorldPay gateway COM.

WorldPay supplies sample ASP files: purchase.asp and callback.asp. Purchase.asp "ported" just fine to C# (after using Reference -> Add Reference -> COM -> WorldPay Select COM to import the DLL).

BUT their callback.asp script failed with the following error

System.Runtime.InteropServices.COMException (0x80004005): aspcomp.AspComponentException: AspComponent: Retreiving MTx object context failed
at System.RuntimeType.ForwardCallToInvokeMember(String memberName, BindingFlags flags, Object target, Int32[] aWrapperTypes, MessageData& msgData)
at WorldPay.COMcallbackClass.processCallback()
at ASP.callback_aspx.Page_Load(Object sender, EventArgs e) in C:\Inetpub\worldpay\callback.aspx:

on

WorldPay.COMcallbackClass callback = new WorldPay.COMcallbackClass();
callback.processCallback();

I couldn't figure out why one page would work and the other not. Surely it's the same DLL being imported/loaded? Changing the Identity in IIS, removing code, adding try/catch... nothing fixed the problem.

Just as I was about to give up and leave that one page in ASP 'Classic', I remembered long ago reading about a 'compatibility mode' in ASP.NET. A quick Google revealed this doco on aspcompat=true.

And it works! I could not find a single reference to this Googling for 'WorldPay' 'InterOp' 'Error' or any other likely terms -- surely I'm not the first!?!?
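For anyone landing here from a search: the setting goes in the @ Page directive at the top of the .aspx file (callback.aspx in this case). A minimal fragment; the other attributes on your own directive will differ.

```aspx
<%@ Page Language="C#" AspCompat="true" %>
```

With AspCompat on, the page runs on a single-threaded apartment (STA) thread and the classic ASP object context is made available to COM components, which is presumably what the WorldPay callback object was looking for when it complained about the MTx object context.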

[hmm.. my first blog post...]