Been fooling around a lot with the Destiny API lately. It’s been an experiment in strangeness, to say the least.  It has been a little difficult to work with, but only because the API documentation is blocked at my office (where I do my thinking), because games are the devil at schools.  That makes it tricky to figure out what I need to provide in order to get the data I want (like having to provide a Platform ID and a Character ID in order to get the number of kills).

That being said, it’s a learning exercise for objects and classes in PHP, which is fun.  It’s certainly a good distraction from the clusterfuck that is American Politics today.

At work, we have a bunch of different models of devices.  We actively maintain no fewer than 3 separate models of Desktops (1 for the K-3, 1 for 4-5, 1 for 6-8, 2 for 9-12 but these are usually a mix of the 4-5 and 6-8) and no fewer than 4 different Mobile devices (4 models of Chromebooks, 3 models of Laptops).  Most of the Mobile devices, including the Chromebooks we get from Dell and Samsung, are pretty easy to work on.

This is not the case for the Dell Laptop model Precision M2800.  They are a bear.

I’ll tell you why.

We just received new UPS units (Battery Backups) for our office since we got new ESXi infrastructure (Cisco Hyperflex, which is baller frankly) and the power requirements changed as a result of that new infrastructure. Plus, the old batteries were starting to die in mass quantities (after not being replaced for 3+ years…) which was leading to headaches up to and including loss of power in the datacenter. Not cool.

Cisco Hyperflex

APC SmartUPS X 3000

These new SmartUPS X 3000 are VERY cool though.


We just finished configuring the management interfaces for them, and that got me thinking: running ping checks on these is all well and good, but what kind of data can we pull from them via SNMP?  If it’s on the network I can grab data from it.  That’s my story and I’m sticking to it.

Thankfully the management cards support SNMP v1 through v3, and the configuration of it is easy enough.  That’s an exercise for the reader: if you can’t figure out the 3 clicks it takes, then the rest of this will probably be way over your head.

So I set out to write my first set of truly self-written monitoring templates.  It was surprisingly easy once I started to read the documentation for Zabbix 3.0.

It’s got three “Applications” with different items, about 15 total:

  • UPS Connectivity
    • Ping
    • Packet Loss
    • Response Time
  • UPS Information
    • Battery Installation Date
    • Firmware Revision
    • Serial Number
    • System Model
  • UPS Status
    • Battery Remaining Capacity (in %, the charge left on the battery)
    • Battery Run Time Remaining (in minutes, the time until bad things happen)
    • Battery Supplying Voltage (in V, the voltage the battery is supplying to the battery backup system)
    • Battery Temperature (in C, the temperature of the battery)
    • Input Frequency (in Hz, to check the quality of the power)
    • Input Voltage (in V, to check the quality of inbound power)
    • Output Frequency (in Hz, to check how the UPS regulators are working)
    • Output Voltage (in V, to check how the UPS regulators are working)
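One wrinkle with the “in minutes” item: if the management card hands back remaining run time as SNMP TimeTicks (hundredths of a second), the raw number needs scaling before it reads as minutes.  That encoding is an assumption on my part here, so check what your card actually returns.  In Zabbix this would be a multiplier on the item; the Python below only shows the arithmetic, and the function name is made up for illustration.

```python
# SNMP TimeTicks count hundredths of a second (RFC 2578).
TICKS_PER_SECOND = 100
SECONDS_PER_MINUTE = 60


def timeticks_to_minutes(ticks: int) -> float:
    """Convert a raw TimeTicks value into minutes.

    Assumption: the run-time item is reported as TimeTicks. If your device
    already reports minutes or seconds, this conversion does not apply.
    """
    if ticks < 0:
        raise ValueError("TimeTicks cannot be negative")
    return ticks / TICKS_PER_SECOND / SECONDS_PER_MINUTE


# 222000 ticks = 2220 seconds = 37 minutes of run time left.
print(timeticks_to_minutes(222000))  # 37.0
```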

It has a bunch of triggers too.

  • Battery Remaining Capacity 1% (Disaster Alert)
  • Battery Remaining Capacity 10% (High Alert)
  • Battery Remaining Capacity 25% (Average Alert)
  • Battery Temperature Excessive >36C (High Alert)
  • Battery Temperature High >30C, <36C (Average Alert)
  • Inbound Power Quality Warning (+/-20 V over spec, +/-10 Hz over spec, while power is still being provided (>0 V input))
  • Inbound Power Failure (0 V being supplied)
  • Outbound Power Quality Warning (+/-20 V over spec, +/-10 Hz over spec, while power is still being provided (>0 V output))
  • Packet Loss
  • Response Time
  • Network Unavailable

All the triggers depend on the network being available, so the dependency is set there.  Additionally, Inbound Power Quality Warning depends on Inbound Power Failure.  I do not like duplicate alerts.  These were fairly interesting to write up and the expression constructor was VERY helpful, as was the expression tester.
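To make the threshold and dependency logic concrete, here is a rough Python sketch of how the triggers sort themselves out.  This is not how Zabbix evaluates anything internally (it uses trigger expressions and dependencies); it is just the same decisions written as plain code.  Assumptions: a nominal 120 V / 60 Hz spec (the template’s actual nominal values aren’t listed above), thresholds treated as “at or below”, and the Packet Loss and Response Time triggers left out.  All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed nominal spec for illustration only.
NOMINAL_VOLTS = 120.0
NOMINAL_HZ = 60.0


@dataclass
class Reading:
    network_up: bool
    capacity_pct: float
    battery_temp_c: float
    input_volts: float
    input_hz: float
    output_volts: float
    output_hz: float


def capacity_alert(pct: float) -> Optional[str]:
    # Check the most severe threshold first so only one alert fires.
    if pct <= 1:
        return "Battery Remaining Capacity 1% (Disaster Alert)"
    if pct <= 10:
        return "Battery Remaining Capacity 10% (High Alert)"
    if pct <= 25:
        return "Battery Remaining Capacity 25% (Average Alert)"
    return None


def temperature_alert(temp_c: float) -> Optional[str]:
    if temp_c > 36:
        return "Battery Temperature Excessive (High Alert)"
    if temp_c > 30:
        return "Battery Temperature High (Average Alert)"
    return None


def out_of_spec(volts: float, hz: float) -> bool:
    return abs(volts - NOMINAL_VOLTS) > 20 or abs(hz - NOMINAL_HZ) > 10


def evaluate(r: Reading) -> List[str]:
    # Everything depends on the network: if it is down, report only that.
    if not r.network_up:
        return ["Network Unavailable"]
    alerts = [a for a in (capacity_alert(r.capacity_pct),
                          temperature_alert(r.battery_temp_c)) if a]
    # A power failure suppresses the quality warning (no duplicate alerts).
    if r.input_volts <= 0:
        alerts.append("Inbound Power Failure")
    elif out_of_spec(r.input_volts, r.input_hz):
        alerts.append("Inbound Power Quality Warning")
    if r.output_volts > 0 and out_of_spec(r.output_volts, r.output_hz):
        alerts.append("Outbound Power Quality Warning")
    return alerts


# Low battery, warm battery, sagging input, healthy output.
print(evaluate(Reading(True, 8, 32, 95, 60, 120, 60)))
```

The early return for the network check is the code version of “all the triggers depend on the network being available”, and the if/elif on input voltage mirrors the Power Quality Warning depending on Power Failure.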

Expression Constructor

Expression Tester

Here is the Zabbix template, in case you’d like to use it. 🙂

smart-ups-x3000-monitoring

Here’s a screenshot of the monitoring so far:

UPS Monitoring Latest Data

Cheers, and may your Zabbix instance alerts be few!

-M, out

What a bunch of days.  On October 6th at around 3:20am I got an email from JetPack informing me that my site was down.  I was asleep at this time.

I woke up, saw it, and saw a follow-up from JetPack.  I immediately assumed it was the “Everything is OK now” email.  It was not.  As I started my day and made my coffee, I checked and saw that my site was still down.  Both of my sites were down, in fact.  They’re both on the same Digital Ocean droplet.  Greatttttttt.  I was already running on nil sleep, and it was going to be a day.  Thursday wanted to put up a fight.

So I log into Digital Ocean since I can’t ping or SSH into my host and find that it is, in fact, running.  Crap.  If it was off that’d have been a simple solution.

This was going to be anything but simple.

I used the Console connect utility and ran ifconfig.

Horror.

All I saw was lo, the loopback interface.

What.  The.  Hell.

I ran uptime.  6 hours.

Double What The Hell.

A little more digging and I saw some feedback from lshw -c net: I had 2 new network adapters: ens3 and ens4.

Weird.

I edited my network config file (/etc/network/interfaces) and renamed eth0 instances to ens4.  No dice.  Still no networking.

I then re-edited my network config file and renamed the ens4 to ens3.  This fixed it.  I don’t know what the hell happened, but I was confused as hell.  Total down-time: approximately 9.5 hours.  Thank goodness this isn’t a really super serious server.  Blah.

Now that I was sitting down with a working box, I decided to delve deeper.  Around 3:15am I saw a random “reboot” command issued to my droplet via the console.  I looked through the logins, thinking that SOMEHOW someone got access to my account.  No logins except for my IP address.

SOMEONE (or something) at Digital Ocean rebooted my droplet.

Why this reboot caused eth0 to become ens3 I do not know.

It has something to do with Ubuntu distro upgrading, but I upgraded to 16.04.1 a few weeks ago.

Super frustrating.

So: if you have a VM and find that you have lost your network connection and your network adapters are missing, check the output of lshw -c net and verify that the interface is named what you think it should be.
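If you’d rather have a script do the comparing, here is a small Python sketch of the same check: pull the interface names out of the config file and see which ones the kernel doesn’t know about.  It assumes the classic /etc/network/interfaces format with “iface <name> …” stanzas; the function names are my own invention, and it only reports, it doesn’t fix anything.

```python
import socket
from typing import Iterable, List


def configured_interfaces(config_text: str) -> List[str]:
    """Return interface names from 'iface <name> ...' stanzas, in order."""
    names: List[str] = []
    for line in config_text.splitlines():
        parts = line.split("#", 1)[0].split()  # drop comments, then tokenize
        if len(parts) >= 2 and parts[0] == "iface" and parts[1] not in names:
            names.append(parts[1])
    return names


def missing_interfaces(config_text: str, present: Iterable[str]) -> List[str]:
    """Interfaces that are configured but not actually present."""
    present_set = set(present)
    return [n for n in configured_interfaces(config_text) if n not in present_set]


def check_host(path: str = "/etc/network/interfaces") -> List[str]:
    """Compare the config file against what the kernel reports (Unix only)."""
    with open(path) as fh:
        config = fh.read()
    present = [name for _, name in socket.if_nameindex()]
    return missing_interfaces(config, present)


# The situation from this post: config still says eth0, kernel says ens3/ens4.
sample = "auto lo\niface lo inet loopback\n\nauto eth0\niface eth0 inet dhcp\n"
print(missing_interfaces(sample, ["lo", "ens3", "ens4"]))  # ['eth0']
```

On a real box you would call check_host() from the console; anything it returns is a name the config expects but the system no longer has.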

-M, out

Back to our regularly scheduled programming.  I’ve written a lot of not-quite-technical posts in the past few weeks.  I know I did this week (because the gas tax has me furious).  All that being said, I decided to make a right-proper one this time because I’ve been toying around with this project at work and information is pretty slim because it’s out of date.  We needed a web server.  A small web server.  Apache, PHP, MySQL.  PhpMyAdmin to make part of the project super easy.

Well, the tiny part was easy.  Damn Small Linux.  Base install less than half a gig.

Adding Apache, PHP, MySQL, and PhpMyAdmin: not so much.  All the instructions were hand-wavy and the newest installer scripts don’t work on the size of disk I wanted.

So I present to you: Linux, Apache, PHP, MySQL, PhpMyAdmin: <768MB total install size.

I was having difficulty coming up with something to write this week. In fairness, I’ve been distracted by car woes (and work woes, and woes in general) and my time has been largely occupied. I was submitting another request to MakerBot for an update to a subject when a topic came to me: Help-Desk Auto-Reply Messages.

I’ve been entering tickets with SolarWinds (for Web Help Desk) and MakerBot (for a Replicator 5th Gen) for over a week to get a few issues resolved.  My frustration initially came from SolarWinds.  I entered the support ticket and immediately got a receipt of the ticket in my inbox.  In fairness, I did indicate that the ticket was neither high-priority nor a rush.  That being said, I don’t think it’s unreasonable to expect a non-automated answer within a business day.  I entered the ticket on Sept 23 @ 12:11pm.  The ticket stated:

Hello,

We deployed the Solarwinds Linux OVA file to our ESXi infrastructure several years ago. It is running VM Version 7. We recently upgraded to a new ESXi infrastructure. In order to do Snapshots we need to upgrade to VM Version 11.

Is there any problem with upgrading to VM Version 11?

Please let us know. There is no rush.

On Sept 27th at 1:40pm I entered a comment on the ticket asking for an update as to whether or not they were even looking into it.  At this point, I had not received any email other than the automated email.

Finally, on Sept 28th at 8:01pm I received a reply from a tech answering the question for me (hint: it’s ok and I don’t need to be so paranoid about it).

All that being said, the time from ticket entry to first response was low if you count the initial “We got your ticket” email that is now automated by 90% of ticketing systems.  The time from ticket entry to first helpful response was absurdly high (over 3 business days).  Again, granted, I said no rush, but going 5 days (3 business days) without any non-automated contact is absurd.  Even a simple “I’m looking into this for you” would have kept me appeased.  An automated response: not so much.

Let’s try and keep this in mind when we work on ticketing solutions.  An automated email does not (or should not) count as customer contact.  If you’re including it in your metrics (which some places do, oddly enough) then you’re probably using poor metrics.

I don’t like metrics to begin with, but if you’re going to use them (and I know you will) then some useful ones are:

  • Time from ticket entry to time of first human-contact.
  • Time from ticket entry to time of first feedback (request for more information, request to try steps for a solution) which may be the same as above.
  • Time from ticket entry to time of resolution.
  • Active time spent on the ticket (if you can measure it).
  • Customer feedback on support job.

These seem like the most valuable metrics to me, especially #2, #3, and #5.
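As a quick illustration of the first metric, here is a Python sketch that measures time to first human contact and deliberately ignores the automated receipt.  The timestamps are the ones from my SolarWinds ticket; the year (2016), the exact minute of the auto-receipt, and the event labels are assumptions for the sake of the example.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# (timestamp, source) where source says who generated the event.
Event = Tuple[datetime, str]

HUMAN_SUPPORT = "support_human"


def time_to_first_human_contact(opened: datetime,
                                events: List[Event]) -> Optional[timedelta]:
    """Elapsed time until the first reply written by a support person.

    Automated receipts and the customer's own comments do not count.
    Returns None if no human reply has happened yet.
    """
    human = [ts for ts, src in events if src == HUMAN_SUPPORT and ts >= opened]
    return min(human) - opened if human else None


YEAR = 2016  # assumed; only month and day are given above

opened = datetime(YEAR, 9, 23, 12, 11)
events = [
    (datetime(YEAR, 9, 23, 12, 11), "support_auto"),   # "we got your ticket"
    (datetime(YEAR, 9, 27, 13, 40), "customer"),       # my request for an update
    (datetime(YEAR, 9, 28, 20, 1), "support_human"),   # the actual answer
]

wait = time_to_first_human_contact(opened, events)
print(wait)  # 5 days, 7:50:00
```

Counting the auto-receipt instead would put this number at zero, which is exactly the problem with including it in the metric.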

For example, on a scale of 1-5 (1 being worst, 5 being best) when dealing with SolarWinds:

#1, #2: 1 [Took way too long to hear anything from a human]
#3: 2 [Resolution was achieved quickly and easily]
#4: 5 [Time spent to fix it was minimal]
#5: 3 [Pretty much average considering]

Those are pretty dismal numbers in my book.

MakerBot had different issues. I’ve had two different tickets with them.

The first ticket was to register new warranties for the 3 MakerBot devices.

On the same 1-5 scale for this ticket:

#1, #2: 1 [I heard from them within the day with what they needed from me to get the steps done]
#3: 5 [The full resolution took over 2 weeks due to delays in registering the warranties]
#4: 5 [A lot of waiting time and the system kept trying to auto-close the tickets]
#5: 4 [Dismal]

The second ticket was to get actual support for one of the 3 MakerBot devices.

#1, #2: 1 [I heard from them within the day, they gave me diagnostic steps and information to check]
#3: 2 [Within 2 days the diagnosis was confirmed and parts were shipped]
#4: 2 [2 days is not as great as 1, but well within the overall time frame]
#5: 2 [Good job overall!]

Of course, just my 2 cents.

-M, out

This week some of you may have heard some bits of utter absurdity regarding a certain manufacturer of laptops and desktops in the news. Yep, once again I’m talking to you about Lenovo. This hurts because I’m a big fan of their hardware. I have a Lenovo Y550p and a U530. I love both of them, they’re workhorses for mobile productivity.

That being said, Lenovo has had more than their fair share of scandals. The most disturbing was the man-in-the-middle exploit certificate they were installing on equipment as part of a factory image. Naughty, naughty Lenovo.  There’s an article about this here.

Early this week we heard news of Lenovo’s new laptop line being so locked down that you wouldn’t be able to install Linux on it. These rumors were quickly confirmed as mostly-true: you -can- run Linux so long as it boots UEFI (most do these days). You cannot (easily) install it though because the UEFI settings force the drives into a hardware raid that there is no Linux driver for (yet). That means you’re relegated to Windows and Live-CD booting of Linux. Sad trombone.  I was initially turned on to this story via a Reddit thread here.

Lenovo was quick to say that it’s a factor of their agreement with Microsoft to sell it as a Microsoft certified device. They claimed that it needed to be a Windows only device or it would not be certifiable. Microsoft quickly chimed in and said: no way, that’s not part of our requirements.

Hanlon’s Razor teaches: “Never attribute to malice that which can be adequately explained by stupidity.” Whether it was malicious intent on Lenovo’s part or stupidity on the CSR’s part remains to be seen.

It’s frustrating to see from the sidelines though. At some point Lenovo should release a driver (hopefully not a binary blob) for the RAID which will make Linux happy. If they don’t, I’m sure SOMEONE will.

This is a scary direction for devices, OS designers, and technology in general to be going. I stick with custom-built PCs (outside of laptops and smartphones of course, though I anxiously await the arrival of the first modular smartphones) specifically for the flexibility that this allows. I am not bound to buying a super high end device if all I need is a machine with a beefy video card. Conversely if I need to crunch big numbers I do not need to buy a box with a massive video card, just lots of RAM and a fast CPU. There is no such flexibility for vendors to provide these features. It’s more economical to provide 3-4 base models with minimal modularity.

My fear is that PC land will rapidly approach Apple-esque levels of lockdown: you cannot run (easily) anything other than an Apple OS. You can install Windows but as far as I know you need to install Apple OS first and initiate the install via their tools.

I think hardware designers, manufacturers, re-sellers and OS designers need to take a clear step back and evaluate what they’re potentially doing to the PC ecosystem, before it is too late.

-M, out.

Not in the method you might expect.

I’m talking about realizing when you’re capable of doing work that people want to charge you for, and thus doing it yourself in order to be more cost efficient.

As with most things: there’s a story here.

On Monday 9/5 I started hearing a loud screeching from my car when I was driving.  It’s happened before, so I ignored it until I got to work, figuring it would go away.  By the time I drove home from work that day, the sound had largely subsided.

On 9/6, the sound came back.  This cat and mouse game continued until I got fed up and called the dealer on 9/12 and said I’d like them to do some diagnostic service.  They agreed, and scheduled me for 9/15.  On 9/13 my car decided to all but quit on me.  Thankfully, I was able to get it to the dealer.

They came back and said:

Here’s the deal: to get you out the door, you need new belts.  Fine, do it, I need to get out the door.

That being said: Your AC system needs a recharge (I agreed) and dye (I agreed again).   They wanted to do brakes and rotors for the front tires.  I already knew they were needed, and the parts were already ordered.  I passed.

Then comes the real kicker: the ball joint on my driver’s side is shot and needs replacement ASAP.  They quote me $500.  I balk.  With good reason.

A 5 minute google search (and RockAuto search) reveals the part is $40 shipped, and is all of 3 bolts.  I balked for a reason.  I can handle 3 bolts.  My guess is you can too.  The process is:

  1. Park car and put on parking brake.
  2. Jack car up on one side.
  3. Remove wheel.
  4. Loosen bolt attaching control arm ball joint to the wheel hub.
  5. Loosen remaining two bolts.
  6. Remove.
  7. Reinstall new part in opposite order.
  8. Torque bolts to manufacturer specs.

I balked hard, because this to me seems like… an hour’s worth of work for ME, with hand tools and no lift.  For them? 20 minutes.  Tops.

The part (for them) is ~$150.  The Alignment is $100.  That means labor and shop fees were $250.  For 20 minutes of work.

Seriously.

I went to Rock Auto and bought both driver and passenger control arms for $80 shipped.

I got them in a day.

I went and bought impact tools to make my life easy.

So now here’s the cost/comparison.

Dealer:

  • Control Arm, Driver’s Side
    • $395.00
  • Control Arm, Passenger’s Side
    • $395.00
  • Alignment
    • $100.00
  • AC Recharge + Dye
    • $139.00
  • Brakes & Rotors, Driver’s and Passenger’s Side
    • $375.00
  • Belts
    • $199.00
  • Total:
    • $1,603.00
  • Total, with Tax:
    • $1,715.21

What I paid:

  • Dealer:
    • AC Recharge + Dye
      • $139.00
    • Belts
      • $199.00
    • Total:
      • $338.00
    • Total, with Tax:
      • $352.08
  • Control Arm (Driver & Passenger), Brakes & Rotors (Driver & Passenger)
    • $238.07
  • Impact Socket Set from Harbor Freight
    • $48.11
  • Craftsman Impact Wrench Set
    • $70.22
  • 5-Year Alignment (PepBoys)
    • $125.00
  • Total:
    • $833.48

Savings: $1,715.21 – $833.48 = … $881.73

I saved over half.  OVER.  Half.

Yeah, my time is involved doing it my way, but honestly: assuming it takes me an hour to do each control arm, an hour to do the brakes, and an hour waiting for the alignment, that’s four hours of my time.  That’s over $200 an hour I’m paying myself.
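For anyone who wants to check my math, here it is as a short Python script using the figures listed above.  The 7% tax rate is not something I was quoted; it is inferred from the dealer’s pre-tax and with-tax totals.  The four hours is the estimate from the previous paragraph.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

# Dealer quote, pre-tax, as listed above.
dealer_quote = {
    "Control Arm, Driver's Side": Decimal("395.00"),
    "Control Arm, Passenger's Side": Decimal("395.00"),
    "Alignment": Decimal("100.00"),
    "AC Recharge + Dye": Decimal("139.00"),
    "Brakes & Rotors": Decimal("375.00"),
    "Belts": Decimal("199.00"),
}

# Assumption: 7% tax, inferred from $1,603.00 becoming $1,715.21.
TAX_RATE = Decimal("0.07")

dealer_subtotal = sum(dealer_quote.values(), Decimal("0"))
dealer_total = (dealer_subtotal * (1 + TAX_RATE)).quantize(CENT, rounding=ROUND_HALF_UP)

# What actually left my wallet (dealer line is the with-tax figure).
paid = {
    "Dealer (AC Recharge + Dye, Belts), with tax": Decimal("352.08"),
    "Control Arms, Brakes & Rotors": Decimal("238.07"),
    "Impact Socket Set": Decimal("48.11"),
    "Impact Wrench Set": Decimal("70.22"),
    "5-Year Alignment": Decimal("125.00"),
}
paid_total = sum(paid.values(), Decimal("0"))

savings = dealer_total - paid_total
percent_saved = (savings / dealer_total * 100).quantize(Decimal("0.1"))

HOURS = 4  # two control arms, brakes, waiting on the alignment
hourly = (savings / HOURS).quantize(CENT, rounding=ROUND_HALF_UP)

print(f"Dealer quote with tax: ${dealer_total}")  # $1715.21
print(f"What I paid:           ${paid_total}")    # $833.48
print(f"Savings:               ${savings} ({percent_saved}%)")  # $881.73 (51.4%)
print(f"Per hour of my time:   ${hourly}")        # $220.43
```

Under the four-hour estimate the hourly figure lands a bit over $220, and that is with the cost of the tools already counted against me.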

And that is damn impressive.

Plus, I get to keep the tools! 😀
