Ceph 10.2/Jewel and Xen on Debian Jessie

When using rbd as storage backend and updating to Ceph Jewel, make sure you install qemu-system-x86, qemu-utils and qemu-block-extra from backports (version 2.5); otherwise you might experience virtual machines booting without their attached storage.
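For example, on Debian Jessie the packages can be pulled from jessie-backports like this (a minimal sketch; it assumes the backports repository is already configured in your APT sources):

#Install the qemu 2.5 packages from jessie-backports
#(assumes a line like "deb http://httpredir.debian.org/debian jessie-backports main"
# is already present in your APT sources)
apt-get update
apt-get -t jessie-backports install qemu-system-x86 qemu-utils qemu-block-extra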

In /var/log/xen/qemu-dm-.log you will find errors like these:

xen be core: xen be: watching backend path (backend/console/3) failed
xen be core: xen be: watching backend path (backend/vkbd/3) failed
xen be core: xen be: watching backend path (backend/vfb/3) failed
xen be core: xen be: watching backend path (backend/qdisk/3) failed
xen be core: xen be: watching backend path (backend/qnic/3) failed


The website was offline for a few hours several times in the past weeks. The mainboard of the server has now been replaced, so I hope the service is stable again.

Update 19.02: Today a hard disk failed and needed replacement…

Cronjob to enable timed deep scrubbing in a ceph-cluster

If you have a Ceph cluster with deterministic IO patterns, you might want to schedule the deep scrubbing during a period of lower IO activity. Probesys published a script that can be run from cron and deep-scrubs the oldest n percent of the PGs.

Unfortunately, the output of some commands has changed in newer Ceph versions, so I made some adjustments to get the script working again. I also made it a bit less verbose, so it is better suited for a cronjob. Here is the result:


#!/bin/bash
set -o nounset
set -o errexit

#Commands used below; adjust to full paths if necessary
CEPH=ceph
PYTHON=python
DATE=date
GREP=grep
AWK=awk
SED=sed
SORT=sort
HEAD=head

#What string does match a deep scrubbing state in ceph pg's output?
DEEPMARK="scrubbing+deep"
#Max concurrent deep scrub operations
MAXSCRUBS=2

#Set work ratio from first arg; fall back to '7'.
workratio=${1:-}
[ "x$workratio" == x ] && workratio=7

function isNewerThan() {
    # Args: [PG] [TIMESTAMP]
    # Output: None
    # Returns: 0 if changed; 1 otherwise
    # Desc: Check if a placement group "PG" deep scrub stamp has changed
    # (i.e. != "TIMESTAMP")
    local pg=$1
    local ots=$2
    ndate=$($CEPH pg $pg query -f json-pretty | \
        $PYTHON -c 'import json,sys; print(json.loads(sys.stdin.read())["info"]["stats"]["last_deep_scrub_stamp"])')
    nts=$($DATE -d "$ndate" +%s)
    [ $ots -ne $nts ] && return 0
    return 1
}

function scrubbingCount() {
    # Args: None
    # Output: int
    # Returns: 0
    # Desc: Outputs the number of concurrent deep scrubbing tasks.
    cnt=$($CEPH -s | $GREP $DEEPMARK | $AWK '{ print $1; }')
    [ "x$cnt" == x ] && cnt=0
    echo $cnt
    return 0
}

function waitForScrubSlot() {
    # Args: None
    # Output: Informative text
    # Returns: true
    # Desc: Idle loop waiting for a free deep-scrub slot.
    while [ $(scrubbingCount) -ge $MAXSCRUBS ]; do
        sleep 1
    done
    return 0
}

function deepScrubPg() {
    # Args: [PG]
    # Output: Informative text
    # Return: 0 when PG is effectively deep scrubbing
    # Desc: Start a deep-scrub on PG "PG"
    $CEPH pg deep-scrub $1 >& /dev/null
    #Must sleep as ceph does not immediately start scrubbing,
    #so we wait until the wanted PG effectively goes into deep scrubbing state...
    local emergencyCounter=0
    while ! $CEPH pg $1 query | $GREP state | $GREP -q $DEEPMARK; do
        isNewerThan $1 $2 && break
        test $emergencyCounter -gt 150 && break
        sleep 1
        emergencyCounter=$((emergencyCounter + 1))
    done
    sleep 2
    return 0
}

function getOldestScrubs() {
    # Args: [num_res]
    # Output: [num_res] PG ids
    # Return: 0
    # Desc: Get the "num_res" least recently deep-scrubbed PGs
    local numres=${1:-}
    [ x$numres == x ] && numres=20
    $CEPH pg dump pgs 2>/dev/null | \
        $AWK '/^[0-9]+\.[0-9a-z]+/ { if($10 == "active+clean") {  print $1,$23,$24 ; }; }' | \
        while read line; do set $line; echo $1 $($DATE -d "$2 $3" +%s); done | \
        $SORT -n -k2  | \
        $HEAD -n $numres
    return 0
}

function getPgCount() {
    # Args: None
    # Output: number of total PGs
    # Desc: Output the total number of "active+clean" PGs
    $CEPH pg stat | $SED 's/^.* \([0-9]\+\) active+clean[^+].*/\1/g'
}

#Get PG count
pgcnt=$(getPgCount)
#Get the number of PGs we'll be working on
pgwork=$((pgcnt / workratio + 1))

#Actual work starts here, quite self-explanatory.
logger -t ceph_scrub "About to scrub 1/${workratio} of $pgcnt PGs = $pgwork PGs to scrub"
getOldestScrubs $pgwork | while read line; do
    waitForScrubSlot
    set $line
    deepScrubPg $1 $2
done

logger -t ceph_scrub "Finished batch"
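To run it from cron, an entry like the following could be used (just a sketch: the path /usr/local/bin/ceph-deep-scrub.sh and the nightly schedule are assumptions, adjust them to your environment):

#/etc/cron.d/ceph-deep-scrub: scrub 1/7 of all PGs every night at 23:00
0 23 * * * root /usr/local/bin/ceph-deep-scrub.sh 7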

Using ceph rbd as Xen backend

Since I had to google a lot to find the relevant configuration line, here is the result (assuming a running Ceph installation and Xen >= 4.3):

disk = [ 'format=raw, vdev=xvda1, access=rw,backendtype=qdisk, target=rbd:<pool-name>/<image-name>:id=<cephx-Id>' ]
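Filled in with concrete values it could look like the line below; the pool "rbd", the image "vm01-disk" and the cephx user "xen" are only placeholders for illustration. Note that the id is given without the "client." prefix.

disk = [ 'format=raw, vdev=xvda1, access=rw, backendtype=qdisk, target=rbd:rbd/vm01-disk:id=xen' ]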

Spring cleaning

I just dusted off the map building process. The programs used (mkgmap and splitter) are now the newest available versions, so we can benefit from the improvements made there over the winter.

I wish you a pleasant bicycle ride.