2012-10-08

Scanning printed photos

I was looking around for advice on what settings are best for scanning printed photos, and I was amazed by the number of answers floating around on the great interwebz which are misguided, or technically correct but missing the point. So, here is my advice.

First of all, let us define the goal: For a home user to scan printed photos so as to retain as much as possible of the visual information contained in the print, within reasonable limits, and without wasting too much space.

The answer, in a nutshell: Scan at 600 dpi, save as 24-bit-color compressed PNG or compressed TIFF. Do not use any of the fancy options for noise reduction, color correction, contrast enhancement, etc. that might be provided by your scanner. If you need to improve something, edit the scanned picture later using your favorite image processing software, but never touch the original scanned files: always work on a copy of the original.

If the nutshell is good enough for you, then off you go, and happy scanning.

If you are interested to know why, read on.

Please note that, the way the goal was expressed, we do not have to take into consideration what we are planning to do with the pictures later. When you scan, scan for posterity. Scan so that you can later crop, rotate, retouch, print, etc. If the photos need to be communicated through a low-resolution medium such as the web, then you can later make scaled-down copies for that purpose. But at the time of scanning, the point is to not limit yourself as to what you will be able to do later with the scanned photos.

The resolution of photo prints ranges between 150 and 300 dpi. When we scan at 600 dpi, we are capturing at twice the resolution of the print (or better), which, per the sampling theorem, is the theoretical maximum required to capture all the information that there is on the photo print. Every bit of it. Not an iota of information left out. Sure, some color fidelity will inevitably get lost, but that's a different ball game altogether. As far as resolution is concerned, anything above 600 dpi is a pure waste of space.
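To put the space requirement in perspective, here is the arithmetic, written out as a little C# program for concreteness. (The 4×6 inch print size is only an assumed example.)

    using System;

    class ScanSizeArithmetic
    {
        static void Main()
        {
            const int dpi = 600;
            const double widthInches = 6.0, heightInches = 4.0; //assumed print size
            long pixels = (long)( widthInches * dpi * heightInches * dpi );
            long rawBytes = pixels * 3; //24 bits per pixel = 3 bytes per pixel
            Console.WriteLine( "{0} pixels, {1:F1} MB uncompressed",
                pixels, rawBytes / ( 1024.0 * 1024.0 ) );
            //prints "8640000 pixels, 24.7 MB uncompressed". Scanning at
            //1200 dpi would quadruple that figure for no visible gain.
        }
    }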

24 bits per pixel is also perfectly adequate to fully match the capacity of film to convey color. There will be some reduction in color fidelity stemming from the fact that the sRGB color space into which you will most likely be scanning will not exactly match the color space of the photo, but the loss in fidelity will be imperceptible, and the measures required to correct this problem are disproportionately painstaking, falling outside the "within reasonable limits" requirement stated as the goal in the beginning of this article. Furthermore, when you are working with differences in quality that are not perceptible, it is very easy to make a mistake which reduces, rather than enhances, the fidelity of the digitization, even by an imperceptible amount, and you will never know. So, color is best left at 24 bits per pixel sRGB and never messed with.

Use compressed PNG or TIFF because these file formats offer LOSSLESS compression. Lossless compression means that 100% of the information in the image is retained. Not very close to 100%, not imperceptibly different from 100%, but precisely 100%. Time and again I hear about people who save their pictures in uncompressed TIFF format; obviously, they do not understand squat about compression. It really is not rocket science: there are two kinds of compression, lossy and lossless. Lossy compression (JPEG) achieves huge savings, at a slight (usually imperceptible) expense to quality. Lossless compression (PNG, TIFF) achieves great, but not huge, savings, at NO expense to quality. Using no compression serves no purpose, and is just plain stupid. It just spreads your picture over more sectors on your hard drive, increasing the chance that it will one day be lost due to a sector going bad.
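If you want to convince yourself that lossless really means lossless, a sanity check is easy to write. The following sketch (the file name is hypothetical; it uses System.Drawing, so Windows/.NET Framework is assumed) round-trips an image through PNG and compares every pixel:

    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;

    class LosslessRoundTrip
    {
        static void Main()
        {
            using( var original = new Bitmap( "scan.png" ) ) //hypothetical file
            using( var buffer = new MemoryStream() )
            {
                original.Save( buffer, ImageFormat.Png );
                buffer.Position = 0;
                using( var reloaded = new Bitmap( buffer ) )
                {
                    for( int y = 0; y < original.Height; y++ )
                        for( int x = 0; x < original.Width; x++ )
                            if( original.GetPixel( x, y ) != reloaded.GetPixel( x, y ) )
                                Console.WriteLine( "difference at {0},{1}", x, y );
                    //prints nothing: precisely 100% of the pixels survive.
                }
            }
        }
    }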

Since lossy compression usually represents an imperceptible loss of quality, we could be scanning into JPEG, but the problem here is that if we ever retouch the picture, and then save it again as JPEG, the lossy compression will be re-applied, thus compounding the loss of quality. Keep repeating this cycle, and at some point the deterioration will start becoming perceptible. That's why we always use lossless compression on originals and on working copies, and lossy compression when publishing.
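The compounding effect is easy to demonstrate. Here is a sketch (again using System.Drawing; the file names are hypothetical) which re-encodes a picture as JPEG fifty times; open the result next to the original and the deterioration is plainly visible:

    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;
    using System.Linq;

    class JpegGenerationLoss
    {
        static void Main()
        {
            ImageCodecInfo jpegCodec = ImageCodecInfo.GetImageEncoders()
                .First( codec => codec.FormatID == ImageFormat.Jpeg.Guid );
            var quality = new EncoderParameters( 1 );
            quality.Param[0] = new EncoderParameter( Encoder.Quality, 75L );

            Image image = Image.FromFile( "original.png" ); //hypothetical file
            for( int generation = 0; generation < 50; generation++ )
            {
                using( var buffer = new MemoryStream() )
                {
                    image.Save( buffer, jpegCodec, quality ); //lossy compression re-applied
                    image.Dispose();
                    buffer.Position = 0;
                    using( var decoded = Image.FromStream( buffer ) )
                        image = new Bitmap( decoded ); //copy, so the stream may close
                }
            }
            image.Save( "after-50-generations.jpg", jpegCodec, quality );
            image.Dispose();
        }
    }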

2012-07-05

Canon Pixma MX700 printer dead and resurrected (Not!)

My Canon Pixma MX700 printer died the other day as it was printing. It just went completely dead, as if the power cord was unplugged. Today I decided to troubleshoot it. First, I tried a different power outlet. That was not it. Then, I checked the power cord. No problem there. Then, I took out the power supply box and disconnected it from the printer, then I reconnected it and put it back in. No improvement. Then I opened up the power supply box and examined the circuit board in it.

Canon Pixma MX700


I am not terribly familiar with hardware, so I could not tell whether the smell inside was that of burned electronic components, or just the regular smell of electronics that have been sitting inside a closed box for a few years. Nothing looked burned, though, and all the electrolytic capacitors seemed intact. I located three fuses on the board, and I checked each one of them for continuity. They were all fine.

So, I decided to give up, and I started putting things back together. I placed the circuit board back in the power supply box, I closed the box, and then for some reason I did something backwards: first I connected the power cord to the power supply box, and then I connected the power supply box to the printer. As I was attaching the connector, I noticed that one of the pins momentarily made a little spark. And then, lo and behold, the printer came back to life! The printer had fixed itself!

And that, ladies and gentlemen, is why I hate hardware.

UPDATE 2012/07/13: nope, the resurrection was only temporary. The printer is dead. Dead as a doornail. And that, ladies and gentlemen, is why I hate hardware even more.

2012-05-02

Solved: Local resources unavailable on remote desktop

I experienced this problem today: drive C: of my local computer "Pegasus" was not appearing on the remote computer as "C on PEGASUS" when I connected to the remote computer via Remote Desktop (Terminal Services). All other drives of Pegasus were showing fine on the remote computer, but the one I actually needed (C:) was not. The drive did not even appear under "\\tsclient" in "Network Places".

Judging by the problems reported by people from all over the world who have this problem and are searching for solutions on the interwebz, it may happen with any local resource, like printers, the clipboard, etc.

Luckily, I found a solution to the problem:

Terminate the RDP session not by closing the RDP window, but by actually logging off. Then, start a new RDP session, and the problem will have most likely gone away.
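A convenient way to make sure you actually log off, rather than just disconnect, is to run the logoff command in a command prompt inside the remote session:

    C:\> logoff

This terminates the logon session immediately, so the next connection starts fresh.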

It is unclear why local resources sometimes fail to show on the remote computer during an RDP session; it is one of those things that "just happen", and that tend to go away if you just "close and reopen it". (Or get out of the car and get back in again, as the joke goes.) The reason for the frustration with this particular problem is that more often than not we do not really "close and reopen it", because we tend to just close the RDP window, which does not terminate our logon session with the server. By logging off and connecting again, the logon session gets restarted, and that's the "close and reopen" needed to fix the problem.

2012-04-27

.Net code running faster under the profiler?

So, today it occurred to me that the C# application that I am developing is a bit too slow on startup, and I decided to throw the Visual Studio profiler at it to see if I had goofed up somewhere. To my astonishment, under the profiler my app ran 10 times faster. The slowness I wanted to troubleshoot was nowhere to be found.

I also tried running the release version, and as I expected it performed better than the debug version under the profiler, so the universe was still in its place, but still, I would very much like to know what the profiler did that made the debug version of my app run so much faster. For one thing, it would be a great convenience to be able to enjoy this speedup while developing; waiting for 2 instead of 20 seconds for my app to start every time I want to check something would be very good for productivity.
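Incidentally, a crude but effective way to compare such configurations is to time the startup with a Stopwatch rather than eyeballing it; a minimal sketch:

    using System;
    using System.Diagnostics;

    static class StartupTimer
    {
        //Started as early as possible, e.g. at the top of Main().
        private static readonly Stopwatch stopwatch = Stopwatch.StartNew();

        //Call this once the main window is up; compare the figure across
        //configurations (debugger attached, profiler, release build).
        public static void Report()
        {
            Console.WriteLine( "startup took {0} ms", stopwatch.ElapsedMilliseconds );
        }
    }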

I tried my luck with various Google searches, and I found a couple of articles on StackOverflow, but none pointed at the exact cause of the problem.

Luckily, after quite a bit of hard thinking, troubleshooting, and browsing through the myriad of potentially relevant settings in Visual Studio, I found the answer: 

It is the "Enable unmanaged code debugging" feature.

In Visual Studio this feature is not under "Tools / Options / Debugging" (because that would make too much sense); it is under "Project / Properties / Debug". Enabling that feature makes everything slow as molasses. The profiler disables the debugger, and that feature with it, so the application appears to run lightning fast.

Here is a StackOverflow question to which I added my newly acquired wisdom:

stackoverflow.com: Launching VS Profiler boosts Application Performance x20?


2012-01-13

The "Handoff" Pattern

I had been thinking about posting this for quite some time now, and by sheer coincidence I got a chance to mention it just the other day in an answer that I wrote to a question on Programmers-StackExchange. So, here it is in a more formal way:

If class M stores or manipulates or in any other way works with instances of a destructible (disposable) class D, it must not assume responsibility for destructing those instances, unless it is explicitly told that ownership of the instances is transferred to it. Therefore, class M must accept a boolean called 'handoff' as a construction-time parameter, stating whether instances of D are being handed off to it, in which case it may destruct them when it is done with them.

Example:
    //Note: the IReader interface extends IDisposable
    IReader reader = new BinaryStreamReader( ... );
    reader = new BufferedStreamReader( reader, handoff:true );
    try
    {
        /* use the reader interface */
    }
    finally
    {
        reader.Dispose(); //this destructs the buffered stream reader, and 
                          //destruction cascades to the binary stream
                          //reader because handoff was specified.
    }

Example:
    var collection = new CollectionOfDestructibles( handoff:true );
    collection.Add( new Destructible( 1 ) );
    collection.Add( new Destructible( 2 ) );
    collection.Add( new Destructible( 3 ) );
    collection.Dispose(); //this destructs the collection and every single
                          //one of its contents, since handoff was specified.

In languages which support optional parameters, the 'handoff' parameter should default to false.
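
For the sake of concreteness, here is a minimal sketch of what a class honoring 'handoff' might look like. The wrapper class is made up for this example, built on System.IO.Stream:

    using System;
    using System.IO;

    public sealed class StreamWrapper : IDisposable
    {
        private readonly Stream stream;
        private readonly bool handoff;

        public StreamWrapper( Stream stream, bool handoff = false )
        {
            this.stream = stream;
            this.handoff = handoff;
        }

        public int ReadByte() { return stream.ReadByte(); }

        public void Dispose()
        {
            if( handoff ) //we were given ownership, so destruction cascades.
                stream.Dispose();
        }
    }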

2012-01-06

C# Blooper №3: No warnings about fields having already been initialized.


Before reading any further, please read the disclaimer.

When you declare a member variable and you pre-initialize it at the same time, and then you try to re-initialize it within the constructor without ever making use of its original pre-initialized value, you receive no warning about the field having already been initialized.

namespace Test3 
{  
    public class Test 
    {  
        public readonly string m = "m"; 
        public string n = "n"; 
        private string o = "o"; 
        protected readonly string p = "p"; 
        protected string q = "q"; 
        private string r = "r"; 

        Test() 
        {  
            m = "m2"; //Blooper: no warning about field having already been initialized. 
            n = "n2"; //Blooper: no warning about field having already been initialized. 
            o = "o2"; //Blooper: no warning about field having already been initialized. 
            p = "p2"; //Blooper: no warning about field having already been initialized. 
            q = "q2"; //Blooper: no warning about field having already been initialized. 
            r = "r2"; //Blooper: no warning about field having already been initialized. 
            o.ToLower(); //to prevent Warning CS0414: The field is assigned but its value is never used. 
            r.ToLower(); //to prevent Warning CS0414: The field is assigned but its value is never used. 
        }  
    } 
}  

This means that you may accidentally invoke complex initialization logic twice, unnecessarily wasting memory and clock cycles, and it may also lead to logic errors, if by any chance that initialization logic has side effects which are only meant to occur once. It may also confuse someone reading your code (or even yourself, looking at your code months later) trying to figure out the purpose behind the seemingly repeated initialization, before the realization sinks in that it is simply redundant. Furthermore, if the re-initialization happens to differ from the pre-initialization, a good question arises as to which of the two was meant to be the correct one.
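
To see the double invocation happen, consider this little demonstration (the class is contrived, of course): the field initializer runs first, then the constructor body runs and silently discards its result.

    using System;

    public class Widget
    {
        private static string CreateValue( string source )
        {
            Console.WriteLine( "expensive initialization invoked by " + source );
            return source;
        }

        private string value = CreateValue( "field initializer" ); //runs first...

        public Widget()
        {
            value = CreateValue( "constructor" ); //...then this runs, discarding
                                                  //the first result.
        }

        public string Value { get { return value; } }
    }

    //new Widget() prints both lines: the initialization logic executed twice,
    //and the compiler never said a word.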

It is a pity, because the compiler could warn the programmer against this pitfall.

Also see related post: C# Blooper №2: No warnings about accessing uninitialized members.

-

C# Blooper №2: No warnings about accessing uninitialized members.


Before reading any further, please read the disclaimer.

When you declare a member variable, and then you try to read it from within the constructor without having first initialized it, you receive no warning about accessing an uninitialized member. This happens even if the member is declared as readonly.

namespace Test2  
{  
    public class Test 
    {  
        public readonly string m; 
        public string n; 
        protected readonly string o; 
        protected string p; 
        private readonly string q; 
        private string r; 

        Test() 
        {  
            m.ToUpper(); //Blooper: no warning about accessing uninitialized member. 
            n.ToUpper(); //Blooper: no warning about accessing uninitialized member. 
            o.ToUpper(); //Blooper: no warning about accessing uninitialized member. 
            p.ToUpper(); //Blooper: no warning about accessing uninitialized member. 
            q.ToUpper(); //Blooper: no warning about accessing uninitialized member. 
            r.ToUpper(); //Blooper: no warning about accessing uninitialized member. 
            q = "q"; //to prevent Warning CS0649: Field is never assigned to, and will always have its default value null 
            r = "r"; //to prevent Warning CS0649: Field is never assigned to, and will always have its default value null 
        }  
    } 
}  

Someone might argue that this behavior is fine because the member in question is guaranteed to contain its default value. First of all, a readonly member containing its default value is completely useless. (See C# Blooper №1: No warnings about uninitialized readonly members when the class is public and the member is public, protected or protected internal.) Secondly, if the compiler is to help the developer catch potential errors and write better code, this is not a valid excuse: a different strategy is necessary.

If the programmer intends the member to contain its default value, then the programmer ought to explicitly state so. Failing to do so ought to imply intention to initialize the member later on, and certainly before any attempt is made to read the member.  This way, the programmer can have it both ways: they can have members pre-initialized to their default values, and they can receive warnings when they fail to initialize members.
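
Under that strategy, intent would be spelled out as follows. (This is a hypothetical sketch of the proposed convention, not of what today's compiler actually does.)

    public class Test
    {
        private readonly string m = null; //explicitly stating "default value intended":
                                          //no warning would be given.
        private string n;                 //no initializer: reading it before assigning
                                          //it would produce a warning.

        Test()
        {
            n = "n"; //first assignment, before any read: no warning.
        }
    }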

Also please note that the compiler is capable of detecting that the value with which a member is being explicitly initialized is the default value for the type of the member, and so it can refrain from emitting any additional code for the assignment; thus, there is no performance issue.

Also see related post: C# Blooper №3: No warnings about fields having already been initialized.

-

2012-01-03

Pernicious Local Variable Initialization

Again and again I see programmers doing the following:
    string variable = string.Empty; 
    if( some.condition )
        variable = some.value;
    else
        variable = some.other.value;
You may see it with strings being set to String.Empty as in the example, or with integers being set to zero, pointers to null, etc. A very large number of programmers believe that when declaring a local variable you should always pre-initialize it with some value. The belief is so popular that it practically enjoys "common knowledge" status, and is considered "best practice", despite being dead wrong. I call it The Practice of Pernicious Local Variable Initialization. Here is why.
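
First, for contrast, the non-pernicious form: declare the variable without an initializer and let the compiler's definite assignment analysis do its job.

    string variable;
    if( some.condition )
        variable = some.value;
    else
        variable = some.other.value;

If either branch now forgets to assign the variable before it is used, the compiler reports error CS0165 (use of unassigned local variable); pre-initializing the variable silences exactly that check.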

The practice was not always bad. It started back in the dark ages of the first C compilers, when it was kind of a necessity. Compilers back then had a combination of unfortunate characteristics:

2011-12-31

C# Blooper №1: No warnings about uninitialized readonly members when the class is public and the member is public, protected or protected internal.


Before reading any further, please read the disclaimer.

The C# compiler is kind enough to give you a "field is never assigned to" warning if you forget to initialize a readonly member which is private or internal, or if the class in which it is being declared is internal. But if the class is public, and the readonly member is public, protected or protected internal, then no warning for you! Why, oh why?

namespace Test1  
{  
    class Test1 
    {  
#if TRY_IT  
        public readonly int m; //OK: warning CS0649: Field is never assigned to, and will always have its default value 0  
        protected readonly int n; //OK: warning CS0649: Field is never assigned to, and will always have its default value 0  
        internal readonly int o; //OK: warning CS0649: Field is never assigned to, and will always have its default value 0  
        private readonly int p; //OK: warning CS0649: Field is never assigned to, and will always have its default value 0  
        protected internal readonly int q; //OK: warning CS0649: Field is never assigned to, and will always have its default value 0  
         
        Test1() 
        { 
            if( p != 0 ) //To avoid warning 'The field is never used' 
                return; 
        } 
#endif 
    }  
  
    public class Test2 
    {  
#if TRY_IT  
        private readonly int m; //OK: warning CS0649: Field is never assigned to, and will always have its default value 0  
        internal readonly int n; //OK: warning CS0649: Field is never assigned to, and will always have its default value 0  
 
        Test2() 
        { 
            if( m != 0 ) //To avoid warning 'The field is never used' 
                return; 
        } 
#endif  
        public readonly int o; //Blooper: no warning about field never assigned to.  
        protected readonly int p; //Blooper: no warning about field never assigned to.  
        protected internal readonly int q; //Blooper: no warning about field never assigned to. 
    }  
  
    public sealed class Test3 
    {  
        public readonly int m; //Blooper: no warning about field never assigned to.  
    }  
}  

For a moment you might think "well, a descendant might initialize that member", but that theory does not hold any water, for a number of reasons:
  • Internal classes may also be subclassed, but the compiler does not fail to issue the warning in their case.
  • Sealed classes may not be subclassed, but the compiler fails to issue the warning in their case, as Test3 in the sample code demonstrates.
  • The warning makes sense for the sake of the integrity of the base class regardless of what a derived class may or may not do.
  • Last but most importantly, the C# specification expressly prohibits a derived class from initializing a readonly member of a base class. You get Error CS0191: A readonly field cannot be assigned to (except in a constructor or a variable initializer), which, incidentally, is a little bit misleading, because you may well be trying to assign the field from within a constructor, only it is the constructor of the wrong class, as the snippet below demonstrates.
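
Here is the misleading error in action (a contrived two-class example):

    public class Base
    {
        public readonly int m;
    }

    public class Derived : Base
    {
        public Derived()
        {
            m = 42; //Error CS0191: A readonly field cannot be assigned to (except
                    //in a constructor or a variable initializer) -- and yet we ARE
                    //in a constructor; just not a constructor of the right class.
        }
    }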
According to the MSDN documentation about this warning, the exhibited behavior is to be expected:

    Compiler Warning (level 4) CS0649:
    Field 'field' is never assigned to, and will always have its default value 'value'
    The compiler detected an uninitialized private or internal field declaration that is never assigned a value.

The question is: why?

UPDATE:

I posted this question on StackOverflow, and Eric Lippert himself answered it. The short answer is that it is an oversight of the compiler, but the long answer is also quite interesting and worth reading.
-

C# Bloopers


Please do not get me wrong; C# is awesome. It is the language of my choice, even though I am pretty well versed in C++ and Java. That having been said, it cannot be denied that C# has its share of flaws, too. In this series of posts I am documenting some of them, in no particular order.

Also please note that many of the issues described herein are Visual C# bloopers, not C# bloopers in general.

C# Blooper №1: No warnings about uninitialized readonly members when the class is public and the member is public, protected or protected internal.

C# Blooper №2: No warnings about accessing uninitialized members.

C# Blooper №3: No warnings about fields having already been initialized.

C# Blooper №4: Lame/annoying variable scoping rules, Part 1

C# Blooper №5: Lame/annoying variable scoping rules, Part 2

C# Blooper №6: No warnings about unused parameters.

C# Blooper №7: No warnings about unused private methods.

C# Blooper №8: No warnings for conditions that are always true/false

C# Blooper №9: Annoying case statement fall-through rules.

C# Blooper №10: Switch statements are not properly formatted.

C# Blooper №11: Zero to Enum conversion weirdness

C# Blooper №12: 'Where' constraints not included in method signatures

C# Blooper №13: Stack and Queue do not implement ICollection

C# Blooper №14: Weird / annoying interface method visibility rules.

Stay tuned, there is more to come.
-