In my travels I have used various wireless networks, both free and paid for, that have had varying levels of brokenness. The worst ones I have encountered are the ones that have their DNS servers set up to wildcard all domains AND prevent name lookups to any servers but their own; one rung up from these network bottom feeders are the ones that just have their DNS set up to wildcard.
Both of these setups screw me over on the laptop because I have a tunnel set up using stunnel that allows postfix on my laptop to securely deliver mail to my ISP's secure mail relay. To do this postfix wants to resolve localhost.localdomain, which these broken network setups resolve to something other than 127.0.0.1, which, of course, breaks the whole process.
To try and work around this problem I set up BIND on my laptop so that, rather than relying on (possibly broken) external DNS providers, I had my own. I added some functions to /etc/dhclient-enter-hooks to prevent my /etc/resolv.conf being overwritten. This worked well for everything but the worst network offenders, which prevent DNS lookups to anything but their own infrastructure. For those I was relenting, using their DNS and living with the fact that I couldn't send emails until I connected to a better configured network. That was until one day when I was moaning about this situation online and somebody suggested setting up BIND to use the DNS servers given to me as forwarders. This is a great idea: it allows me to protect myself from the broken idea of wildcarding .localdomain but still have DNS that works with even the most broken network setups. I set about configuring this.
Firstly in /etc/named.conf I added the following to the options section:
include "/etc/namedb/forwarders";
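For context, the include line ends up inside the options block of named.conf; a sketch of how it sits (the directory setting here is illustrative, keep whatever your options block already contains):

```
options {
        directory "/etc/namedb";            // illustrative; keep your existing settings
        include "/etc/namedb/forwarders";
};
```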
This will cause BIND to include the contents of the given file. To create the file we need to hook into the information retrieved by dhclient; this is done by adding the following function to /etc/dhclient-enter-hooks:
make_resolv_conf() {
    if [ -f /etc/namedb/forwarders ]
    then
        mv /etc/namedb/forwarders /etc/namedb/forwarders.old
    fi
    printf "forwarders { " > /etc/namedb/forwarders
    for nameserver in $new_domain_name_servers
    do
        printf "%s; " ${nameserver} >> /etc/namedb/forwarders
    done
    echo "};" >> /etc/namedb/forwarders
    echo "forward only;" >> /etc/namedb/forwarders
    pkill -HUP named
    return 0
}
The above shell function takes the DNS server list given in the variable new_domain_name_servers and formats it into a valid BIND forwarders statement. After writing the forwarders into the file it sends named a HUP signal to make named re-read the configuration. Of course, /etc/resolv.conf is untouched here and is simply configured to query localhost.
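Given the hook above, a lease offering two name servers produces a two-line file. This sketch reproduces the hook's formatting against a temp file instead of /etc/namedb/forwarders (the 192.0.2.x addresses are examples):

```shell
# Simulate what the hook writes, using a temp file rather than
# /etc/namedb/forwarders and a hypothetical DHCP-supplied server list.
fwd=$(mktemp)
new_domain_name_servers="192.0.2.1 192.0.2.2"
printf "forwarders { " > "$fwd"
for nameserver in $new_domain_name_servers
do
    printf "%s; " ${nameserver} >> "$fwd"
done
echo "};" >> "$fwd"
echo "forward only;" >> "$fwd"
cat "$fwd"
# forwarders { 192.0.2.1; 192.0.2.2; };
# forward only;
```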
So far this configuration has allowed everything on my laptop to operate correctly irrespective of how badly set up the DNS is on the network I am connecting to. Win.
Saturday, October 27, 2012
Tuesday, September 11, 2012
Error messages
I love error messages, especially the ones that make you think that something else is wrong so you beat your head trying to bash the wrong thing into shape.
I spent quite a few hours trying to get a solaris jumpstart to actually do the install on a machine that is roughly 1200km from the jumpstart server. This was a slow, tedious process because the round trip time made everything take a long time. The error I was getting was something like:
Error: Unable to create 9680Mb slice (unnamed)
followed by a dump of the mbr and partition table of the disk in the machine, giving one the impression that the solaris installer was somehow objecting to the partitions/mbr currently on the disk. No amount of clearing or manually setting up the partitions would make the error go away.
The real problem? I had foolishly ASSuMEd that leaving in the directive for creating a mirrored pool on a single-disk machine would be harmless... oh no. What I think the error really meant was "you told me to make a mirror and I can't find another device to do that on". It would have been nice if the error had been along the lines of "insufficient vdevs for mirror" or something similar. After I removed the mirror option from the pool setup directive the error went away... *sigh*.
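For illustration, the offending profile directive looked something like the first line below, with the fix being the second (device names are hypothetical; it is the mirror keyword that demands a second device):

```
pool rpool auto auto auto mirror c0t0d0s0 c0t1d0s0   # fails on a single-disk machine
pool rpool auto auto auto c0t0d0s0                   # single-disk form that worked
```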
Monday, September 10, 2012
NetBSD on a stick
This is not really a new idea but after a couple of upgrade "oopses" that rendered my machine unbootable due to init getting upset about not finding MAKEDEV I decided I needed a safety net that was a little bit more portable than the USB hard disk enclosure that had saved me in the past.
I had a spare 4Gb USB memory stick, which is more than ample to hold a full NetBSD install including X and a few other bits and pieces, so I decided to put the memory stick to good use. The advantage of using a memory stick is that it is easy to update, and we have the entire NetBSD toolset available, not some cut-down version. The downside is that the memory stick has a limited number of writes, so one wouldn't last very long in full-time day-to-day use, but as a portable rescue device, or even just to test new hardware before buying, it can be useful.
On my laptop the memory stick appeared as sd0, so I will use this device in all my examples; those following along at home may need to adjust the device naming.
Firstly, I needed to adjust the mbr. The memory stick had a single MSDOS partition on it; this needed to be changed to a NetBSD one (type 169). To do this I ran fdisk:
fdisk -u /dev/sd0d
The -u flag makes fdisk interactive, prompting you to update certain parameters as you go along. For my purposes I just accepted the defaults for everything until I was prompted to change a partition. At this point I selected the only partition there, changed the sysid to 169 and set a bootmenu label of NetBSD; all other parameters were left at their defaults. Once I had finished editing the partitions I accepted the default "none" at the partition prompt to move on. Fdisk prompted me to update the mbr, to which I said yes, and I also said yes to updating the mbr type to bootsel since I had a boot menu.
Once the mbr was set it was time to edit the disklabel:
disklabel -e /dev/sd0d
This opens up the disklabel in an editor. I left most of the settings as is, only editing the e: partition: I changed the e: into an a:, changed the fstype to "4.2BSD", set the fsize to 2048, the bsize to 16384 and the cpg/sgs to 0. The line ended up looking like:
a: 7883696 1104 4.2BSD 2048 16384 0 # (Cyl. 0*- 3849)
Then I just saved the changes and quit the editor. With the new disklabel in place I could then create the filesystem:
newfs /dev/sd0a
This took a while to do but eventually completed without problem. Once the file system was built I mounted it:
mount /dev/sd0a /mnt
Then I unpacked the installation sets onto the memory stick. I had created my own release files using "build.sh release", but I could just as easily have downloaded the official sets; the files I needed were under the binary/sets directory. I just cd'ed to that directory and did:
cdir=`pwd`
for f in *.tgz
do
(cd /mnt && tar zxpf ${cdir}/${f})
done
and waited for a long time; the unpacking was very slow. Once this was done I picked the kernel-GENERIC.gz file from binary/kernel, uncompressed it and copied it to /mnt/netbsd.
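The uncompress-and-copy step can be done in one go with gunzip -c. Here is a sketch using stand-in files; on a real run the source would be the release's binary/kernel/kernel-GENERIC.gz and the destination /mnt/netbsd:

```shell
# Demonstrate the one-step uncompress-and-copy with stand-in files;
# substitute the real release kernel and /mnt/netbsd when doing it live.
tmp=$(mktemp -d)
echo "kernel image contents" > "$tmp/kernel-GENERIC"
gzip "$tmp/kernel-GENERIC"                           # stand-in for the release file
gunzip -c "$tmp/kernel-GENERIC.gz" > "$tmp/netbsd"   # stand-in for /mnt/netbsd
cat "$tmp/netbsd"
```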
To make the memory stick bootable I did:
cp /usr/mdec/boot /mnt/boot
installboot -fv /dev/sd0d /usr/mdec/bootxx_ffsv1
I had to use the -f flag on installboot because I was getting an error "Old BPB too big"; the -f flag forced installboot to continue anyway, and the result seemed to work fine.
I created a new fstab:
/dev/sd0a / ffs rw,log 1 1
and edited /mnt/etc/rc.conf to change "rc_configured" to "YES" so the machine would come up multiuser.
After this it was just a matter of rebooting my machine; the BIOS picked up the USB device and started a boot from it. I was presented with a boot menu, and when I selected the only option my machine slowly booted from the memory stick. I was able to log in and do everything you would expect to be able to do on a clean NetBSD install.
Saturday, August 25, 2012
Linux Lab in the Lap
I am an RHCE and have been working toward getting an RHCA, which involves sitting five "certificate of expertise" exams: deep-dive technical exams on various Red Hat products such as satellite, cluster, naming services and a couple of others. To help me prepare for the exams I wanted to set up a lab environment that emulated as closely as possible the one used in the training courses associated with the exams. Fortunately I had been generous with most of the specifications on my laptop so, apart from disk space, I had the capability to set up a virtual environment on it. I upgraded the laptop hard disk to give me the room and developed the "lab in the lap".
Originally I developed this running on RHEL 5, which was reasonably challenging since it involved a bunch of manual tweaking and hacking on xen infrastructure scripts directly (even finding out what to do proved difficult). I have recently updated the environment to RHEL 6, which proved to be much easier to configure.
This picture shows what the lab configuration looks like:
There are three isolated networks: cluster, iscsi1 and iscsi2. Another network, appnet, is NAT'ed to the host's network interface to provide external access if required. Each machine has four network interfaces. All three VM nodes (node1 to node3) are built by first creating a template node (call it node0) using kickstart. The subsequent nodes are created by cloning node0. To simplify the cloning process the IP addresses for all the clone nodes are assigned using DHCP, which means there are no node-specific actions to be done.
To create the setup I first set up all the networks using the virtual machine manager. This is a bit tedious to do but not difficult. In RHEL 6 you can also use the VMM to assign addresses to the virbr NICs that are created when you create an isolated network; this gives the host access to the isolated networks. In RHEL 5 setting up the network interfaces was a bit trickier: first I had to make sure I had enough "dummy" network interfaces by adding:
options dummy numdummies=5
to /etc/modprobe.conf. Then I created a script in /etc/xen/scripts called multi-network-bridge which set up the multiple bridges and the associated dummy interfaces:
#!/bin/sh
/etc/xen/scripts/network-bridge $@ vifnum=0 netdev=eth0 bridge=xenbr0
/etc/xen/scripts/network-bridge $@ vifnum=1 netdev=dummy0 bridge=virbr1
ifconfig dummy0 172.16.50.254 netmask 255.255.255.0 broadcast 172.16.50.255 up
/etc/xen/scripts/network-bridge $@ vifnum=2 netdev=dummy1 bridge=virbr2
ifconfig dummy1 172.17.1.254 netmask 255.255.255.0 broadcast 172.17.1.255 up
/etc/xen/scripts/network-bridge $@ vifnum=3 netdev=dummy2 bridge=virbr3
ifconfig dummy2 172.17.101.254 netmask 255.255.255.0 broadcast 172.17.101.255 up
/etc/xen/scripts/network-bridge $@ vifnum=4 netdev=dummy3 bridge=virbr4
ifconfig dummy3 172.17.201.254 netmask 255.255.255.0 broadcast 172.17.201.255 up
Finally, I modified /etc/xen/xend-config.sxp and replaced the argument for the network-script entry with my own multi-network-bridge script so that xend would run my script on start up.
For the isolated networks, iptables was blocking network access to the host. Libvirtd automatically configures iptables to provide the NAT facility for guests. I added a rule allowing the 172.17.0.0/16 network access on any port so the guests could communicate with the host on the other ethernet interfaces. Unfortunately you cannot just reload the iptables rules, because this flushes the rules added by libvirtd, which breaks the virtual networking.
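A sketch of the kind of rule described (the chain choice and exact flags are my assumption, not the original command; it must be run after libvirtd has installed its own rules, not via an iptables reload):

```
iptables -I INPUT -s 172.17.0.0/16 -j ACCEPT
```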
With the networks set up it was then time to kickstart node0, the template to be used to stamp out the other nodes. On my host I set up a directory with the contents of the RHEL distro under /var/www/html/rhel-5.6 and pointed the yum host config at this. I installed httpd to support a network kickstart. I dropped a basic kickstart file into /var/www/html/ks and then used:
virt-install --name=node0 --ram=512 --vcpus=1 --disk path=/var/lib/images/node0.img,size=8 --extra-args="ksdevice=eth0 ks=http://172.17.1.254/ks/node0.ks" --mac 00:16:3e:00:00:00 --network network:cluster --network network:appnet --network network:iscsi1 --network network:iscsi2 --location=/var/www/html/rhel-5.6/
Here 172.17.1.254 is the IP address of one of the dummy/virbr-nic interfaces on the host machine. The MAC address comes from the manually allocated xen ethernet range. To make things simple I allocated the MAC addresses with the second-to-last octet being the node number and the last octet being the interface number of the VM. This made the scripting of the cloning simpler. With this scheme I configured static DHCP allocations for all the nodes. This can be done using "virsh net-edit net_name", e.g.:
virsh net-edit cluster
to edit the cluster net and add the static definitions in the dhcp section of the XML, like this:
<network>
  <name>cluster</name>
  <uuid>f0ddb5c3-7db7-9943-2195-aa0454971d0d</uuid>
  <bridge name='virbr2' stp='on' delay='0' />
  <ip address='172.17.1.253' netmask='255.255.255.0'>
    <dhcp>
      <range start='172.17.1.128' end='172.17.1.250' />
      <host mac='00:16:3e:00:01:00' ip='172.17.1.1' name='node1' />
      <host mac='00:16:3e:00:02:00' ip='172.17.1.2' name='node2' />
      <host mac='00:16:3e:00:03:00' ip='172.17.1.3' name='node3' />
    </dhcp>
  </ip>
</network>
The host entries above are the static address allocations. Once this file is edited I found I had to manually copy the XML file from /etc/libvirt/qemu/networks into /var/lib/libvirt/network/. It seems that the VMM manages the files in both places but virsh doesn't, and without the copy the changes will not take effect. In RHEL 6 I found I could restart libvirtd (service libvirtd restart), then send a SIGHUP to dnsmasq, then restart libvirtd once again. The first restart writes out the dnsmasq control files; then we can kick dnsmasq; then we restart libvirtd again because it seems to lose contact with dnsmasq following the SIGHUP, which breaks DHCP. I don't know if this process works on RHEL 5; I just rebooted after making changes back then.
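In command form, the restart dance described above looks like this (RHEL 6 service names; pkill targeting assumed to be safe because only libvirt's dnsmasq instances are running):

```
service libvirtd restart    # first restart writes out fresh dnsmasq control files
pkill -HUP dnsmasq          # dnsmasq re-reads the new static allocations
service libvirtd restart    # libvirtd loses contact with dnsmasq after the HUP
```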
Once the template node was built I cloned it using:
virt-clone -o node0 --force --mac 00:16:3e:00:01:00 --mac 00:16:3e:00:01:03 --mac 00:16:3e:00:01:01 --mac 00:16:3e:00:01:02 --name node1 --file /var/lib/images/node1.img
Doing a clone is slightly faster than building the node using kickstart. Once the node is cloned I set the VM to autostart on boot and start it up:
virsh autostart node1
virsh start node1
That's it: a node ready for experimenting with. The labs I was doing called for up to three nodes running, which was a simple matter of creating more clones. I wrote a shell script that manages the building and cloning of the nodes; the command line arguments are similar to those of the Red Hat script used in the actual labs, though I have added extensions to build more/different nodes so I can use the lab setup for more than one course.
I found when building the RHEL 6.2 clones that the ethernet interfaces were numbered eth4 to eth7 instead of eth0 to eth3. This was due to a set of udev rules intended to attach a particular ethernet interface to a particular device. This makes sense on a real-world machine because it keeps the interface numbering consistent even when an interface goes missing but, in my case, this feature was not desirable, so I had to add a post action to the RHEL 6.2 kickstart to remove the udev rules. I have also noticed that since upgrading the host to RHEL 6.2 my cloned machines no longer boot properly when the template machine is not built with a graphical console. This used to work fine when the host OS was RHEL 5.6. Trying to debug this is a bit awkward because there is no console access, nor is the networking coming up. I guess I could loop-mount the disk image and see if there are any logs, but for the moment I can put up with making the template node with a graphical install.
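The udev cleanup mentioned above can be done with a kickstart %post section along these lines (the rules file name is the one RHEL 6 uses for persistent net rules; verify it on your release):

```
%post
rm -f /etc/udev/rules.d/70-persistent-net.rules
%end
```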
Footnotes:
- For those 1337 h4x0rz out there interested in cracking the root password in the linked kickstart file, here's a hint... try redhat to save yourself a bit of time.
- The course I attended wasn't actually on RHEL 6; it was run on RHEL 5.8. Not that this really presented an issue, as I just added the distro files and set up a kickstarted template for that OS version. I was able to complete all the labs with this setup. The practice really helped in the exam: I was able to complete it in half the allotted time with a score of 100%.
Friday, August 03, 2012
lab in the lap adventures
To help me practice for a Red Hat certification exam I set up RHEL 5.6 on my laptop. I configured xen and was using a bunch of virtual machines, each with multiple network interfaces connected to isolated networks. For convenience all the interfaces were configured using DHCP; this way I could build one VM and just clone it to make the other machines. During the clone the network interfaces were given different MAC addresses, and DHCP just handed out the static allocations for those MAC addresses. This all worked quite smoothly, though it did take a lot of work and searching to put all the bits together, and it helped me a lot in going over all the lab exercises before my exam.
I am going to do another course and exam soon, but the course will be using RHEL 6.2 for the OS, so I thought it would be good to update my install. Red Hat's recommendation is to do a clean install between major versions, but I really didn't want to lose my xen configuration, so I attempted to perform an upgrade instead. You can do this by hitting escape within the first 60 seconds of the cdrom boot and then adding:
linux upgradeany
to the boot command line. This forces the upgrade regardless of the version. The upgrade seemed to work OK, but the first hurdle was that the process had not written a new entry in grub.conf to boot the upgraded system. This was easy to fix: just boot the cd in rescue mode and fix it. I then tried to get X working properly, but no matter what I did I would either get a black screen or X would come up and just hang. In the end I decided that it would probably take me more time to figure out what was wrong with X than to do a clean install and redo my VM setups. After a clean install X worked fine; I had some problems initially with the synaptics touchpad, but that was just a matter of configuring it in xorg.conf.
I had a USB stick with what I thought was a copy of the old /etc directory plus my VM build scripts and the like. On closer inspection I found these files were a bit out of date. I had done a dd of the entire linux disk to a file on a NetBSD machine, so I knew I could get the files back... the trick was how. I could have just put the image back onto my laptop disk, but that would have been tedious: copying the data over, booting the old OS, getting what I wanted and then reinstalling 6.2 again. Fortunately, the NetBSD machine with the dd image on it had enough room to hold the rhel image uncompressed. I thought that what I could do was use vnconfig to create a virtual drive from the file like this:
vnconfig -c vnd0 ./dd_image_file
and then use lvm on NetBSD to access the logical volumes. Unfortunately it looks like the version of the NetBSD kernel on that machine is too old, and it had issues with lvm. I knew that NetBSD on my laptop worked fine with lvm, so what I ended up doing was NFS-mounting the file system from the machine with the image (I didn't have enough room on the laptop to hold the disk image locally), then using vnconfig as above to create the virtual disk. I then did:
lvm vgscan
which found the lvm volume groups on vnd0. I then did:
lvm vgchange -a y
mount -t ext2fs /dev/mapper/volgroup-lvol00 /mnt
and there we have it: a disk image on one machine, NFS-mounted to another, through a virtual disk driver into lvm. It is a bit tortuous (and a bit slow due to the NFS mount going over wireless) but I can access the files and pull off what I want without having to reinstall.
When I get organised I will do a write up about the lab in the lap.
Sunday, July 29, 2012
Managed to update the usbasp with new firmware. I used a scrap bit of veroboard to put a couple of headers on, wired pin for pin, to create a joiner from my programming cable to the usbasp cable. I loaded a slightly modified avrisp sketch (changed the reset pin to D0) onto the leostick. Oddly, I found that with one of the USB ports on my laptop, testing the setup by reading the firmware from the usbasp would randomly hang part way through; when I changed to another USB port the process worked flawlessly. Once I had read the old firmware I uploaded the new firmware to the usbasp, which worked fine. Now there are no more messages from avrdude about not being able to set sclk when I upload sketches using the usbasp.
Tuesday, July 24, 2012
More playing with arduinos. I bought an Etherten, which I plan to use for my data logger project. It has pretty much all I need on a single board apart from the wireless interface. I managed to source a board with a Nordic nRF905 on it for the wireless; this is the same chip used in the little monitoring display supplied with the solar inverter. It talks over SPI, so there should be a good chance of getting it going with the arduino.
The first step was uploading sketches to the Etherten. This proved more difficult than I thought it would be after getting the leostick working. The Etherten steadfastly refused to be programmed in NetBSD (in Windows it was fine...). I tried updating the firmware for the usb->serial bridge chip by mostly following the instructions at https://andrewmemory.wordpress.com/2011/04/14/upgrading-the-arduino-uno-8u2-using-flip/ though I just used the raw button to get the file down, and I had to hold the main reset down and then short the bridge chip reset to get it into programming mode. That didn't help me. Next I tried updating the bootloader using optiloader on the leostick, for which I had to patch the optiloader code:
--- optiloader.ino 2012-07-18 20:53:16.000000000 +0930
+++ ../optiLoader.pde 2012-07-18 20:51:49.000000000 +0930
@@ -50,15 +50,6 @@
// 9 to power more complex Arduino boards that draw more than 40mA, such
// as the Arduino Uno Ethernet !
//
-// For a leostick the pins are:
-// 0: slave reset (arbitrary allocation)
-// 14: MISO
-// 15: SCK
-// 16: MOSI
-//
-// The pins are only available on the ICSP header, this header
-// also supplies the power to the other board too.
-//
// If the aim is to reprogram the bootloader in one Arduino using another
// Arudino as the programmer, you can just use jumpers between the connectors
// on the Arduino board. In this case, connect:
@@ -98,11 +89,11 @@
/*
* Pins to target
*/
-#define SCK 15
-#define MISO 14
-#define MOSI 16
-#define RESET 0
-#define POWER 1
+#define SCK 13
+#define MISO 12
+#define MOSI 11
+#define RESET 10
+#define POWER 9
// STK Definitions; we can still use these as return codes
#define STK_OK 0x10
This is due to the leostick not having the programming pins in the "normal" arduino place. I made up a cable and managed to program the Etherten bootloader, but still no joy. While I was mucking around with this I also found a usbasp on ebay, which is an in-circuit programmer for Atmel chips - and that happens to be what the arduinos use. The programmer and a 10 to 6 pin adaptor came to less than $7 including postage so I bought those.
The good news is that the usbasp works fine under NetBSD with the native avrdude; I can upload sketches to the Etherten with no problems. Avrdude complains that the usbasp needs a firmware update, which is another adventure. I have loaded the AVRisp sketch onto the leostick but I need to make a cross-over adaptor to connect my cable to the usbasp cable, then I can update the firmware.
Friday, June 29, 2012
These are not the bugs you are looking for
There is something vaguely frustrating about finding and fixing a bug in some software thinking that you have found your problem and things will start working right only to find that the problem is still there - you found a bug but not the one you were searching for.
I had this today. I have been, off and on, trying to track down why aspell seg faults when NetBSD curses is used; the problem has the hallmarks of memory being overwritten. I built a version of libcurses with dmalloc and it was telling me that a boundary was being overrun. After a bit of digging I found that in __init_get_wch there was a memset used to clear a buffer, but the size argument was way too big, causing memset to stomp past the end of the array. I fixed this and dmalloc no longer complained when I ran my simple test code, but aspell still seg faults and the stack backtrace from the core file still looks as mangled as it did before my fix. So, yes, I definitely fixed a bug - just not the one I was aiming to fix.
Wednesday, June 27, 2012
Been distracted fixing a scrolling bug in the menu library (libmenu). When I wrote the code I only really tested it in "row major" mode, where the menu items are laid out left to right in the desired number of columns; the other mode, "column major" for want of a better term, lays the menu items out going down the columns - in both cases the number of rows in a menu is determined by the number of items to display. You are supposed to be able to navigate around the menu items using up/down/left/right commands (there are a lot of other commands too...). To make the navigation easier I pre-calculate the neighbours when the menu is being created and store them in the item structure. Lots of edge cases, literally, doubly so because you can have either "cyclic" navigation, which wraps around the edges of the menu, or "non-cyclic", where navigation stops at the edge of the menu. I had to totally rewrite the neighbour calculation and tidy up the menu drawing. Now it works as it should in both modes.
I have also played around with getting rescanning of the usb bus working. I am not sure if I am happy with what I have, you can rescan an attached device and it will be detached and reattached, you can rescan an entire bus but I am not sure if that does anything. More testing is required.
Thursday, June 14, 2012
The fix is in....
I have committed the changes to umodem (plus usb.h) that allow the driver to attach the serial port on the leostick, I expect this will work for other arduinos too - they should "just work" in NetBSD-current now. The problem of the drivers detaching and not reattaching when the arduino gets reprogrammed needs to be worked on still.
Tuesday, June 12, 2012
Arduino success
I have managed to upload an example sketch to the leostick, it took a bit of sleuthing to get all the bits right.
step 1: I used the boot configuration editor (boot -c) to disable the uhidev driver so that it would not claim the leostick; this meant that the ugen driver claimed it instead, which is what I wanted. I already had a program that used the ugen ioctl USB_GET_CONFIG_DESC to get the configuration description of the device. It turned out the leostick has three interfaces: the first is a CDC ACM (usb serial), the second looks to be CDC Data but didn't make much sense, and the third is the HID that uhidev was latching on to. Given that the NetBSD umodem driver supports the CDC ACM interface type, it was a matter of working out why umodem was not attaching. It turns out the umodem match was checking that the interface protocol attribute was set to a particular number; on the leostick the protocol was 0.
step 2: fix the umodem driver. I just modified the attach so that it would claim the device if the protocol was the AT protocol or if the protocol was 0. Rebuilt the kernel, installed, rebooted, and now when I plug the leostick in the umodem driver claims the serial port. I could connect to the serial port and see output from the leostick.
step 3: I built the blink sketch from the examples in the arduino IDE. I had built a native version of avrdude and put it in the right place in the arduino tree. I tried using the "upload using programmer" menu item but this errored out. I replaced avrdude with a script that logged the command line parameters and tried to manually run avrdude with the logged parameters, but had the same errors. After a bit of fumbling around I found that some of the avrdude parameters weren't right for my environment, plus I could only upload a sketch while the bootloader was active (the first 7 seconds after reset). So, I finally managed to upload a sketch by first pressing reset and then running the command:
/usr/pkg/bin/avrdude -C/home/user/blymn/arduino-1.0.1/hardware/tools/avrdude.conf -patmega32u4 -carduino -P/dev/ttyU1 -Uflash:w:/tmp/build1833870341762611968.tmp/Blink.cpp.hex:i -v
I was still stumped for a while because it seemed like I couldn't upload a sketch without doing a reset, but adding the option to force the baud rate to 1200 let me upload again without resetting - presumably because opening the port at 1200 baud is what kicks the ATmega32u4 back into its bootloader:
/usr/pkg/bin/avrdude -C/home/user/blymn/arduino-1.0.1/hardware/tools/avrdude.conf -patmega32u4 -carduino -P/dev/ttyU1 -Uflash:w:/tmp/build1833870341762611968.tmp/Blink.cpp.hex:i -v -b 1200
The only problem now is that the NetBSD usb drivers detach from the leostick and don't reattach when a sketch is uploaded while another sketch is already running, so I still need to reset the leostick to get the devices back... inconvenient.
Sunday, June 10, 2012
Arduino on NetBSD
I have been toying with using an arduino board to create a low power data logger. Not surprisingly there is no pre-packaged arduino IDE for NetBSD, but there is for Linux, and NetBSD can run Linux binaries. I downloaded the 32bit arduino-1.0.1 software and installed it. When I tried to run the arduino app it told me I was missing java. My previous attempts at getting java and NetBSD to play nicely have not turned out well, but I thought I may as well give it another go and downloaded java for 32bit Linux. I installed java, set my PATH to point at it and then ran arduino again. Much to my surprise the IDE actually runs and seems to mostly work, apart from steadfastly refusing to see any serial ports. I was able to build an example sketch but I cannot upload it to a board because of the serial port problem. The IDE gives you the option to upload using a programmer, which just calls avrdude; this fails because libusb is missing. A quick look in NetBSD pkgsrc shows that avrdude is actually there, so I can build a native version of this tool. In fact, now that I look, it seems that all the "avr" tools required to crossbuild binaries for the arduino boards are available in pkgsrc. So I could build native tools and use a makefile to produce and upload the binaries. I think I will stick with just replacing avrdude for the moment.
As for a target... I have a leostick which was given away at the 2012 LinuxConf.au. I have tried plugging this into my laptop and it attaches as a keyboard and mouse, which is not very helpful; no serial port is attached. Apparently this works on some later versions of linux, I guess due to changes in the CDC ACM driver. In NetBSD this seems to be handled by umodem but I haven't yet managed to get it to attach with this driver. We shall see how it goes.
Sunday, June 03, 2012
Oops
OK - this place was ignored for quite some time, let's see if we can do something regular from now on.
What has happened in the last 4 and a bit years? Well, verifiedexec per-page signatures are still not in the tree. I merged the bottom end of the uvm routines but there was push-back from others on how I had done this, citing the long function argument list as not a good thing. I do have the actual code working, not only the uvm parts but also the per-page verifiedexec stuff. I wrote a simple test that consisted of a binary stored on an NFS share - the binary just calls a couple of functions located in different memory pages (I inserted a bunch of unused code to get the functions onto different pages). There are two versions of the binary: the "good" version, which has a matching veriexec signature and is the one initially run, and the "evil" version, which has a modified second page. The test consisted of starting the "good" binary and checking that both function calls are the good ones, then overwriting the "good" binary with the "evil" one on the NFS server; once this was done the resident executable pages were flushed using msync (this is just a short cut - you could instead force resource starvation). With an unmodified veriexec, the "evil" function would be executed; with the per-page modification, the binary was terminated as soon as the pager attempted to bring in the modified page. This is good, but I really need to fix up the UVM modifications to make them less convoluted.
Part of the reason veriexec hasn't progressed much in the intervening time is that I started working on automatically testing libcurses. This is quite a complex thing to do because curses expects to talk to a terminal. I developed a test framework that takes a simple list of commands and runs them against a test program and then verifies the output matches what is expected. This framework works with the NetBSD ATF (Automated Test Framework) and has been committed to the tree along with a small set of tests. More tests are on the way, there are an awful lot to write and it is a slow process but definitely worth it as a number of previously unreported bugs have been found and fixed.
I talked about both veriexec and the curses testing at BSDCan 2012; there are papers and slides available from both presentations.