Hi Dave! Thanks for responding!
>This does not sound like a common problem. You should also check your system log messages, which will include the kernel messages as well as messages from cardmgr.
Actually, the messages I included in my original posting *are* from the system log (I removed the leading timestamp as it didn't really provide any useful extra information). What you see is what you get.
>The errors at insertion time are not a problem. They pop up for some kernel versions, when the IDE driver tries to issue a "door lock" command that the card doesn't recognize. The driver won't try this again if it fails the first time.
I figured as much and have been ignoring those messages.
>PCMCIA under SMP is not a particularly well tested configuration, and I can't test it myself, so I can't really rule out the possibility that the code contains SMP bugs.
Is there anything I can do to enable debug output to help troubleshoot the problem? Debugging modules isn't my speciality :-( Or is there a place in cardmgr where I can set a breakpoint?
>Looking at the eject detection code, I don't really see how it could be sensitive to whether it is running on an SMP kernel or not. But I also don't see how the hardware could
Since the amount of data/time between hangs varies somewhat, I'd guess there's a race condition that is SMP-dependent. Again, if I rebuild the kernel+pcmcia-cs for non-SMP, everything works as expected, with the exact same hardware config and software.
>mis-report an eject in this situation. Do you think all your hardware should be solid? (no overclocking, power supply has plenty of capacity, no other strange symptoms while running SMP?)
I've scrutinized the hardware pretty thoroughly. The load from all the drives and peripherals I have is well within the tolerance of the power supply (it's a 400W PS). No other software/hardware problems, and I drive this machine pretty hard sometimes (but not while using pcmcia cards ;-).
The card reader/writer is an ISA card, but I have several ISA cards plugged in and they all work perfectly.
>You can effectively disable the test for an ejected card by setting PCIC_OPTS="poll_interval=2000000000". You would then need to issue "cardctl insert" and "cardctl eject" commands by hand. If this is an effective work-around, that would at least narrow down the location of the problem.
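If I understand the suggestion correctly, it amounts to something like the sketch below. Which startup file holds PCIC_OPTS depends on the distribution (/etc/sysconfig/pcmcia is only my guess at a typical location), and the cardctl commands are left as comments because they would be typed by hand once the PCMCIA services have been restarted.

```shell
# Assumption: this line goes in the PCMCIA startup options file
# (for example /etc/sysconfig/pcmcia; the location varies by distribution).
# A poll interval this large effectively disables the eject test.
PCIC_OPTS="poll_interval=2000000000"

# With polling disabled, card events have to be signalled by hand:
#   cardctl insert    # after plugging the card in
#   cardctl eject     # before pulling the card out
```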
I'll try this, but I'm not sure whether an "eject" is being erroneously generated, or whether the interrupt is being missed somehow and an error condition is being raised as an "eject". The timing is too close to tell from the logs. And once the "Busy" occurs, I'm very limited in what I can do to probe for information. Again, if you've got any ideas on getting more diagnostics out of this, I will try to generate them. Any suggestions on things to try? What extra information about my system can I provide to help identify possible problem areas?